ABSTRACT

The major purpose of this study was to compare
self-instructional mastery and nonmastery treatments to determine if
there are differences in learning, retention, and time-to-testing of
high, middle, and low aptitude students. Twenty grade 7 classes from
the Savannah-Chatham County School District served as the
experimental population. Students were tested for placement in one of
three levels of aptitude; then, classes were randomly assigned to two
groups and treatment was randomly assigned to groups. The nonmastery
treatment received a student text and a workbook which contained
prescribed activities and a single review test for each chapter. The
mastery treatment received the same student text; however, the
chapters in the workbook contained two review tests. If the criterion
level was not attained in the first review test, mastery students
were required to correct and relearn material and then take a second
review test. A multiple choice test and a recall test were administered
to measure learning and retention of the content materials. Findings
showed that differences in aptitude were not reduced when
self-instructional materials were used. An implication of this study
is, however, that the lack of teacher monitoring in administering the
review tests may have contributed to the poor performance of low
aptitude students, since typically low aptitude students require
close supervision. (Author/ND)
FOREWORD

This study was undertaken as part of the continuing research and
development of the Geography Curriculum Project, University of Georgia.

The content focus of the Geography Curriculum Project is the
preparation of supplementary units for the elementary grades,
emphasizing the organizing concepts of the discipline of geography.
The research focus is the testing of some psychological construct of
learning, such as the nature of concepts, Ausubel's reception learning
model, Bloom's mastery learning, or Bruner's discovery hypothesis,
under normal conditions of school instruction.

The Geography Curriculum Project thus serves as a small research 
and development center. It develops new materials and measurement 
instruments, field tests and evaluates materials, and facilitates the
training of doctoral students in geographic education. 

The Geography Curriculum Project was initiated as a result of a 
study of geographic content in elementary social science texts, 
manuals, and study guides. The evidence indicated that elementary 
geography is primarily presented as a discrete body of facts, with 
little attention to the organizing concepts of geography which help to 
analyze, interpret, and integrate physical and cultural phenomena. The 
development of systematic geography units helps to clarify the teaching
of geographic knowledge and concepts. The research emphasis answers 
questions relating to the structuring of materials and their use in 
teaching geography. 
CHAPTER I
BACKGROUND TO THE STUDY

A continuing educational challenge is how to organize instruction
in schools to facilitate a high level of learning for the majority of
students. This problem is the core of pedagogy - how to help students
learn more in a given time - and might be regarded as the departure
point for the development of a science of instruction and learning.

In the decade of the 60s, this challenge of organizing instruction
to facilitate learning assumed a new urgency with the re-discovery of
the disadvantaged learner. Under the slogan of "compensatory
education," a variety of programmatic attempts have been made to
overcome the learning deficits of the slow learner, especially learning
deficits which might be attributed to a disadvantaged environmental
background.

Success in school subjects is now regarded not merely as a matter
of school achievement but of personality and social adjustment as well.
Low school performance is cumulative. Consequently, low performing
students are seldom able to overcome learning deficits. Continual low
performance reduces a student's desire for further learning (Sears,
1940) and develops undesirable attitudes toward learning (Khan, 1969).
In turn, these traits lead to the development of poor self-concept
(Torshen, 1969) and possibly mental health problems (Stringer and
Glidewell, 1967). Some critics, such as Block (1971), allege that as
few as one-third of the students have successful and rewarding learning
experiences under traditional assign-recite-test procedures adapted to
the class mean.

Mastery learning has been proposed as a teaching-learning
procedure that may substantially increase the proportion of students
enjoying successful and rewarding school learning experiences. Mastery
learning is a term coined by Bloom who contends that "all or
almost all students can master what they are taught." In contrast to
programmed instruction designed for individual self-instruction,
feedback, and re-learning, Bloom's mastery learning envisions the use
of procedures "whereby each student's instruction and learning can be
managed within the context of ordinary group-based classroom
instruction, as to promote his fullest development."

Bloom not only proposes mastery learning as an alternative which
will give lower performing students the necessary additional time to
learn, but he even alleges that mastery procedures will minimize
differences in achievement resulting from differences in aptitude. He
claims that as many as ninety-five per cent of the school population
can learn most of the material to a stipulated criterion level provided
they are given sufficient time and adequate correction and feedback.
Mastery procedures will not be effective for five per cent of the
population because of innate learning disabilities (Bloom, 1968).

The Bloom hypothesis that mastery learning procedures can overcome
aptitude differences is contrary to the mass of psychological evidence
which indicates that most treatments are insufficient to overcome
differences in aptitude (DeCecco, 1968) and that methods of teaching
share the common result of ineffectiveness (Wallen and Travers, 1963).
In the research of the Georgia Anthropology and Geography Curriculum
Projects, the evidence consistently shows that aptitude, as measured by
reading test scores, is a more significant learning variable than
methods of treatment (Steinbrink, 1970; Freeh, 1973; Dumbleton, 1973).

Furthermore, Bloom mastery procedures are class-paced rather than
individual-paced mastery. In the Bloom procedure, the progress of the
higher aptitude student is retarded by the withholding of additional
learning tasks. Instead, he serves as a tutor or teacher aide to
assist the lower performing and slower student. In contrast, in
individual-paced instruction, whether of the earlier Winnetka type
(Washburne, 1922) or the more recent IPI-type (Glaser, 1968), the
higher aptitude student has consistently achieved at a higher
performance level and completed more units of study.

In a class-paced mastery procedure, as proposed by Bloom, low
achieving students attain the criterion level attained by high
achieving students. But the increase in achievement by low aptitude
students is attained at the cost of two trade-offs which may not be
educationally desirable. One is the slow-down in the achievement pace
of the high aptitude student. This use of high aptitude talent to
assist low achievers might, in the long run, constitute a waste of
educational talent. The short-term run of most mastery studies thus
far, however, provides evidence neither for the abuse of high
aptitude student talent nor for the long-term efficacy of mastery
procedures for low aptitude students.

The second trade-off is in the amount of time required to attain
the criterion level established for "mastery." The provision of extra
learning time for the low aptitude student may provide a substantial
learning difference.

One of the alleged advantages of mastery is that while the 
procedure may be initially slower, the thorough learning of content and 
procedures facilitates subsequent learning. This claim may hold some 
merit for hierarchically organized subjects, such as mathematics or 
foreign languages, but may not be true for subjects, such as the social
sciences, in which the complexity of the subject matter appears to be 
primarily a function of factual, conceptual, and syntactical complexity 
rather than the sequencing of learning hierarchies. 

The social studies contain learning clusters based on the concepts 
and facts being presented, but their sequencing, however logical, 
appears to be arbitrary. For example, in both the Anthropology and 
Geography Curriculum Projects at the University of Georgia several 
topical alternatives were considered in the sequencing of the content. 
In mathematics, foreign languages, accounting, and shorthand, in 
contrast, there are generally agreed on progressions of presentation 
moving from the simple to the more complex. Mastery procedures may 
facilitate subsequent learning in elementary arithmetic but mastery
procedures may not transfer to elementary history, because new factual 
and conceptual material is largely discrete. 

Thus in the social studies it might be possible to attain mastery
over a portion of the material to be covered, but this intensive
coverage is attained at the expense of a more extensive treatment.
Time to teach and learn in a school setting is limited. Consequently,
it is not educationally desirable to ignore the amount of time required
to achieve a given task. In the Carroll model of school learning
(Carroll, 1963), aptitude is a function of the time taken to learn.
Consequently, any investigation of mastery learning must take into
account the time students take to achieve mastery. Time is thus not
only a contextual variable, but it may also be regarded as an important
treatment variable.

Research in mastery learning to this date has not systematically
examined the various variables implicit in any learning system. Rice
(1973) identified seven independent variables and four dependent
variables which require systematic examination to establish a body of
evidence to substantiate the allegations of mastery learning.
Generally, mastery learning has been presented as a panacea (Block, 1971)
with an overgeneralization and overstatement of claims. In a critical
analysis of the state of the art and quality of research, Mitchell
(1974, in draft) concluded that much mastery learning research is based
on crude comparisons of a mastery group with a non-mastery group, often
with ex post facto comparisons. Thus, while mastery learning procedures
have generally been reported as superior to non-mastery procedures
(Kim, 1969, 1970; Block, 1970; Lee, 1971), it is extremely difficult to
assess the results of such research. The reader is left with the
feeling that many comparisons of mastery with non-mastery procedures
are merely comparisons of superior with inferior instruction, or may
result from the halo effect of experimental treatment.

In selecting a focal point for this study in mastery learning, it
was decided to design a study which would give importance to the
aptitude variable in mastery learning. This question appeared to be
crucial, for, as the review of the literature in Chapter II shows, there
appears to be a tendency to make claims for mastery learning which are
not substantiated by the evidence.

General Statement of the Problem

The central question this study addresses is this: If a
mastery procedure is used in teaching a geography unit at the grade
seven level, will the average achievement of students at three levels of
aptitude be significantly different?

Three aptitude levels were arranged using the word meaning section
of the Iowa Tests of Basic Skills: Forms 5 and 6 (Lindquist and
Hieronymus, 1971) as the concomitant variable. A high, middle, and low
group were formed. Since achievement may be measured in terms of
learning, as assessed by immediate posttest, and by retention, as
measured by a delayed posttest, it was decided to measure both learning
and retention to see if mastery procedures might demonstrate superiority
with a time interval in testing. The treatment consisted of a
self-instructional geography text and workbook, Functions of Cities,
Publication No. 74-1, Geography Curriculum Project, University of
Georgia.

In all teaching, the classroom unit of instruction appears to be
crucial in educational research. Since educational researchers
typically must use intact classes rather than randomize assignment of
students to treatment, the research design must take into account the
classroom and teacher variable. In order to minimize teacher effect, it
was decided to use self-instructional materials. But since students
work differently under different teachers, it was considered necessary
that the data analysis take into account the classroom variable.

A second aspect of the Bloom hypothesis implicitly relates to the
time variable. Given enough time and proper feedback, mastery
procedures allegedly overcome aptitude differences. But if high
aptitude students are able to continue to work at learning tasks, not
limited to tasks which are paced to the slower learner, would not higher
aptitude students not only cover more, but achieve at a higher level?

Definition of Terms

For the purposes of this study, the following terms were used:

Mastery Learning is used in accordance with general usage to
describe a teach-test-reteach strategy. There are no set procedures for
mastery learning. There are two major patterns: group-paced, sometimes
called the Bloom model (1968), and individual-paced, sometimes called
the Keller model (1968). The operational characteristics, however, of
any mastery treatment vary with the procedures stipulated by the
investigator. In this study, the mastery procedures include diagnosis,
correction, and restudy after the administration of two review tests.
After completion of the second review procedure, the mastery students
were permitted to continue to the next unit, even without attaining the
criterion. Since the operational procedures are discussed at length in
Chapter III, pp. 51-54, the specific procedures will not be developed at
this point.

Non-mastery learning is a general term used to describe teaching-
learning procedures which do not provide systematic feedback and
opportunity for a student to restudy and learn the subject matter to a
specified criterion. Any kind of instructional procedure, group or
individual, class paced or personalized, structured or unstructured,
open or closed, may be used as a non-mastery procedure.

In this study, non-mastery procedures include the use of a 
structured text with accompanying workbook, review test, and class 
discussion. These procedures, as described in Chapter III, are part of 
the self-instruction also administered to students in the mastery group. 
In order for a comparison of mastery and non-mastery procedures to be
carried out, each procedure must be carefully designed and adhered to.
In addition, the content should be identical. The only differences in
the organization of the content should be those differences which are
essential in making the treatments distinct. The critical difference in
the mastery and non-mastery treatments, as stipulated in this study, is
the requirement that mastery students restudy material and attain a 
specified criterion, 85 per cent, before proceeding to the next unit. 
The non-mastery treatment, in contrast, does not provide additional time 
for restudy and learning. 

Aptitude, in this study, was used to describe a level to which a
student was assigned as measured indirectly by the word meaning section
of the Iowa Tests of Basic Skills: Forms 5 and 6 (Lindquist and
Hieronymus, 1971). It refers to a student's capacity or talent to learn
or understand. Correlations between student performance and an indirect
measure such as a word-meaning test have proven to be high
(Thomas, 1967; Gaines, 1971; Dale, 1972; Pelletti, 1973), and as such,
these measures are good predictors of scholastic aptitude.

Learning is the knowledge and application of facts, concepts, and
generalizations acquired as a result of study in one of the treatment
groups as measured by a posttest directly related in content to the
cognitive objectives of the materials, administered immediately upon
conclusion of the treatment period. Knowledge, as used in this
definition, is used in the general sense of knowing (Webster's Third New
International Dictionary, 1971), and is not to be construed in the
limited sense of knowledge implied by the Bloom taxonomy (Bloom, 1956).

Retention is the amount of knowledge retained as a result of
studying in one of the treatment groups as measured by the same form of
a posttest for learning administered as a delayed posttest.

Times-to-testing is the mean classroom-elapsed time taken by
students in each cell to complete or partially complete the treatment
materials.

Criterion level is a score which mastery students must reach on a
unit review test in order to proceed to the next unit. The eighty-five
per cent level was used as the criterion level in this study. This
criterion was selected because the studies of different criterion levels
cited in Chapter III indicate that the 85 per cent level is sufficiently
high to encourage a greater quality of learning, but not too high to be
discouraging, especially to the lower aptitude student.

Review test is a test administered to each student at the
completion of each chapter. Review tests were used as an indication of
the quality of learning to students and as a reference for reviewing
poor quality learning. Both mastery and non-mastery students completed
the first review test, but at its conclusion non-mastery students
proceeded to the next chapter of work, while mastery students who did
not reach criterion restudied the text and workbook exercises. When
they finished restudying, they took a second review test. This test
contained the original items; however, the items were reordered. Chapter
III, pp. 58-60, contains a more complete explanation of the review tests.
The term 'review test' has been used in this study in lieu of the Bloom,
Hastings, and Madaus (1971) term of 'formative' evaluation. However, their
meanings are not synonymous.
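
The review-test procedure defined above can be pictured as a small decision rule. The following is only an illustrative sketch under stated assumptions: the function name `next_step`, the string labels, and the representation of scores as proportions correct are hypothetical and do not come from the study; the 85 per cent criterion and the limit of two review tests are taken from the definitions above.

```python
CRITERION = 0.85  # criterion level stated in this study (85 per cent)


def next_step(treatment, review_scores):
    """Return the next action for one student on the current chapter.

    treatment     -- "mastery" or "non-mastery" (hypothetical labels)
    review_scores -- proportion-correct scores on the review tests taken
                     so far for this chapter (hypothetical representation)
    """
    if not review_scores:
        # Both treatments take the first review test.
        return "take first review test"
    if treatment == "non-mastery":
        # Non-mastery students proceed after the single review test.
        return "proceed to next chapter"
    if review_scores[-1] >= CRITERION:
        # Mastery students who reach criterion proceed.
        return "proceed to next chapter"
    if len(review_scores) == 1:
        # Below criterion on the first test: restudy and retest once.
        return "restudy, then take second review test"
    # After the second review test, students proceed even without
    # attaining the criterion, as described under Mastery Learning.
    return "proceed to next chapter"


print(next_step("mastery", [0.70]))
print(next_step("mastery", [0.70, 0.80]))
```

The sketch makes explicit that, in this study, the mastery treatment allows at most one restudy cycle per chapter.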

This discussion of terminology is pertinent to the review of the 
literature, the subject of the next chapter, and to the methods and 
procedures of writing the treatment materials and preparing the 
measuring instruments, presented in Chapter III. 
CHAPTER II
REVIEW OF THE LITERATURE

The present study was designed to compare the average achievement
levels of mastery and non-mastery procedures of high, middle, and low
aptitude students, using measures of learning, retention, and
times-to-testing. Students bring a wide range of aptitudes to each learning
experience. It is the hope of teachers that students learn and retain
learning to a high degree. A teaching-learning procedure that
facilitates the learning expectations of teachers for students of
varying aptitudes would offer a valuable contribution to education.
However, if such a procedure were to require more learning time, the
economics of class learning interacting with the many school subjects
might be disadvantageous.

Three independent variables were used in this study. They were:
1) treatment (mastery and non-mastery); 2) aptitude (high, middle, and
low); and 3) class (10 classes for treatments). Three dependent
variables were used. They were: 1) learning (Geography Achievement
Test, posttest); 2) retention (delayed posttest); and 3) times-to-testing
(elapsed classroom time).

The discussion of the literature will focus on the independent and
dependent variables to be used in this study. Therefore, the following
organization was used: 1) antecedents of mastery learning;
2) comparisons of learning by mastery and non-mastery procedures; 3) aptitude;
4) retention; 5) times-to-testing; and 6) mastery learning and the
social sciences.

Antecedents of Mastery Learning

Very few ideas in education today are without a firm base in
earlier pedagogy. Benjamin Bloom's (1968) mastery learning strategy is
no exception. Prior to Bloom's publication several notable attempts
were made in the United States to develop systematic teaching-learning
strategies. Among the systems devised were those of Washburne, Morrison,
and Skinner. However, it was Carroll's (1963) 'Model of School Learning'
that provided the theoretical background for the concept of "mastery."
Washburne's (1922) work with the Winnetka School System in Chicago was
one of the first of note. The Winnetka Plan aimed to individualize
pupil instruction by building a curriculum in which time was varied and
achievement was constant. This required that subject matter objectives
be clearly stated, instructional materials be sequential, appropriate
criterion levels be fixed, diagnostic-progress tests be constructed, and
supplementary self-instructional materials be designed.

The results of experiments conducted at Winnetka indicate that
pupils in the individualized program did not achieve any higher than
pupils in conventional classrooms. However, the individualized program
did appear to reduce the amount of time the pupils spent in learning
(Washburne, Vogel and Gray, 1926).

Morrison (1926) developed a strategy similar to that of Washburne
using students at the Laboratory School of the University of Chicago.
He developed the strategy of "Pre-test, teach, test the result, adapt
procedures, teach and test again to the point of actual learning"
(p. 79). The Morrison model was based on the premise that learning
was attainable given enough time and proper instruction. Morrison
stressed that reteaching procedures should reflect careful decision
making on the part of the teacher after he had reviewed the results of
student tests. The Morrison approach specifically called for test
results to act as the focusing agent for both student and teacher when
further instruction was under consideration.

Washburne's and Morrison's strategies did not appear to be
favorably received within the field of social studies. Boyington
(1932) and Boten (1932) contributed the only reported research found in
the field. They developed diagnostic tests for detecting weaknesses in
the teaching and learning of social studies content. It would appear
at this juncture that the strategies developed by Washburne and
Morrison did not achieve favor due to the development of other
strategies, such as problem solving.

The "teach, test, reteach" strategy did not resurface until the
late 1950s and early 1960s. Skinner (1954) revived it through his
development of programmed instruction. The principal idea of
programmed instruction was that learning of any behavior, no matter how
complex, rested upon the learning of a sequence of less complex
component behaviors. Programmed instruction operationalized Skinner's
stimulus-response learning theory and it appeared to facilitate
learning for those students who required small learning steps, drill,
and frequent reinforcement. However, it did not facilitate learning
for all or almost all students. Carroll's (1963) 'Model of School
Learning' attempted to fill this gap.

Essentially, Carroll's model was a conceptual paradigm that outlined
factors influencing and interacting to produce student success in
school learning. In its simplest form, his model proposed that if each
student was allowed the time he needed to learn some stipulated
criterion level and he spent the required learning time, then he could
expect to attain that level. If the student was not allowed sufficient
time, then the degree to which he could expect to learn was a function
of the ratio of time actually spent in learning to time needed:

$$\text{Degree of learning} = f\left(\frac{\text{time actually spent}}{\text{time needed}}\right)$$
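
As a rough numerical illustration of this ratio, the sketch below computes a degree of learning from time spent and time needed. Carroll leaves the function f unspecified, so the choice made here (the plain ratio, capped at 1.0 once the needed time has been spent) is an assumption for illustration only, and the function name `degree_of_learning` is hypothetical.

```python
def degree_of_learning(time_spent, time_needed):
    """Illustrative reading of the time-spent / time-needed ratio.

    Assumption: f is taken to be the plain ratio, capped at 1.0 when the
    student has spent at least the time needed. The source does not
    specify the form of f.
    """
    if time_needed <= 0:
        raise ValueError("time_needed must be positive")
    if time_spent < 0:
        raise ValueError("time_spent must not be negative")
    return min(time_spent / time_needed, 1.0)


# A student who needs 60 minutes but spends only 30:
print(degree_of_learning(30, 60))  # 0.5
# A student who spends more than the time needed:
print(degree_of_learning(90, 60))  # 1.0
```

Under this reading, allowing a slower student more time raises the ratio toward 1.0, which is the intuition behind treating time as a treatment variable.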

Carroll's model conceived of school learning as consisting of a
series of distinct learning tasks. In each task, the student proceeds
"from ignorance of some specific fact or concept to knowledge or
understanding of it or .... from incapability of performing some act to
capability of performing it" (Carroll, 1963, p. 723). The model
proposed that under typical school learning conditions, the time spent and
the time needed were functions of certain characteristics of the
individual and his instruction. The time spent was determined by the amount
of time the student was willing to spend actively engaged in learning
and the total learning time he was allowed. The learning time each
student required was determined by his aptitude for the task, the
quality of instruction, and the student's ability to understand instruction.
These are the factors that specify the sources of variation that should
be included in the model and which have been used as specified by the
model. The Carroll model is a figurative model, not a mathematical model,
and as such, the components are not additive. The full Carroll model
can now be summarized as:

$$\text{Degree of learning} = f\left(\frac{\text{1. Time allowed}\quad \text{2. Perseverance}}{\text{3. Aptitude}\quad \text{4. Quality of instruction}\quad \text{5. Ability to understand instruction}}\right)$$



Bloom (1968) transformed Carroll's conceptual model into a working
strategy for mastery learning. The mastery learning strategy proposed
by Bloom was designed for classrooms where the time allowed for learning
was relatively fixed. Mastery was defined in terms of a specific set of
major objectives the student was expected to exhibit by the end of a
unit of classroom study. The content was then broken into a number of
smaller learning units, and unit objectives were defined whose
mastery to criterion was essential for mastery of the major objectives.
The instructor taught each unit using typical, group-based methods but
supplemented this instruction with feedback-correction procedures to
ensure that each student's unit instruction was of optimal quality. The
feedback devices were brief review evaluations administered at unit
completion. Each evaluation covered all objectives of a particular unit.
Student achievement on the unit objectives indicated the level of each
student's learning. Supplementary instructional correctives were then
applied to help students overcome their unit learning problems before
continuing with the group instruction.

Since 1968, when Bloom published the mastery learning paradigm, a
number of compendiums have been compiled surveying the efficacy of
mastery learning, both nationally and internationally (Block, 1971, 1973;
Mitchell, in draft). Mastery learning has been implemented at many
levels of education, but research predominates at the college level.
Successful strategies have also been incorporated into subjects ranging
from mathematics to psychology to physics (see Table 2.1).

Research Related to Relevant Mastery Learning Variables

Block (1973) indicates that there has been and continues to be a
growing body of research that supports the use of mastery learning
procedures across a broad spectrum of disciplines and levels. Table 2.1
presents a selected summary of mastery learning research by content area
and level that will be covered in this review.



Table 2.1

Summary of the Number of Mastery Learning
Researchers by Content Areas and Level

Level                Math   Science   Psych.   Social    Language   Other   Total
                                               Studies
College              3      3*        3*       2*        -          -       11
High School          -      -         -        -         -          1       1
Junior High School   3**    1         -        1***      1**        -       5
Elementary           3      -         -        1***      1          -       5
Total                9      3         3        4         2          1       22

*The study of Moore, Mahan, and Ritts (1968) was conducted in three
content areas.
**The study of Kim (1968) was conducted in two content areas.
***The study of Gaines (1971) was conducted at two levels.

The data of Table 2.1 indicate that 50 per cent of the research
has been conducted at the college level and that approximately 50 per
cent has been focused upon mathematics. The remaining content areas and
levels have not received as much attention. Of the studies reviewed,
nine compare a mastery learning and a non-mastery learning procedure;
three involve comparisons of mastery learning with aptitude; four include
retention; two correlate achievement with time spent in learning; and
two report results of mastery learning in social science disciplines.
The review of these five sections follows.
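
The proportions cited in this paragraph can be checked against the totals printed in Table 2.1. The short sketch below is only an arithmetic check; the variable names are hypothetical, and the counts are the row totals and the mathematics column total as they appear in the table.

```python
# Row totals by level and the mathematics column total, as printed in Table 2.1.
level_totals = {
    "College": 11,
    "High School": 1,
    "Junior High School": 5,
    "Elementary": 5,
}
math_total = 9

grand_total = sum(level_totals.values())
college_share = level_totals["College"] / grand_total
math_share = math_total / grand_total

print(f"Grand total: {grand_total}")
print(f"College share: {college_share:.0%}")
print(f"Mathematics share: {math_share:.0%}")
```

On these counts the college share is exactly one half, while the mathematics share is closer to two-fifths than to one half.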

Comparisons of Learning by Mastery and Non-Mastery Procedures

Nine research studies have compared mastery learning to
non-mastery learning procedures. Table 2.2 provides a summary of the
studies. Typically, these studies report results that use data obtained
from a final cognitive summative achievement test.

Airasian (1967) applied a modified version of Carroll's model of
school learning to a class (n=33) of graduate students in test theory.
The objective was to facilitate mastery of the content for all students
over a ten-week period. Ungraded formative evaluations were used to
indicate strengths and weaknesses of student learning and instruction.
Time inventories were tallied twice a week to determine the amount of
time spent on study. Student achievement was measured by a summative
test. His results indicate that, whereas during the previous year 30
per cent of the students received an A, 80 per cent of the sample
achieved at or above the previous year's A grade score on a parallel
exam and thus received A's.

Two other results were also of interest. First, the correlation
between total hours of weekly study and achievement was slightly
negative. Airasian suggests that this may have been due to the
effectiveness of the feedback system in leveling initial differences in prior
exposure to the course materials. It would appear that the diagnostic
tests, by identifying important course aims and behaviours, facilitated
positive student use of time. Second, there was less variability over
time in achievement on the formative evaluation instruments. In spite
of the varying backgrounds possessed by the students, this strategy
appeared to be effective in bringing most of the students to a high
degree of achievement by the end of the course. However, Airasian does
not indicate how many students were repeaters from the previous year or
whether repeating students may have provided an inflated result.

Mayo, Hunt, and Tremmel (1968) conducted a six-week university
summer session in introductory statistics that emphasized the use of
homework and weekly formative tests accompanied by individual and small
group assistance. Student grades were assigned by student performance
in class rather than by relative academic standing within the class.
Both the mid-term and summative examination were used to produce a grade. 

Seventeen students were assigned to either a mastery learning or a
comparison group. The results indicate that 65 per cent of the mastery 
learning group received an A whereas only 5 per cent of the non-mastery 
group reached that standard. It was found that the feedback procedures 
(formative evaluations) and the tutoring facilitated student achievement 
in the mastery learning group. 

In a study by Moore, Mahai, and Ritts (1968), students were 
presented self-instructional materials in biology, psychology, and
philosophy. Students were tested at the conclusion of each unit
(formative evaluations) and, if mastery was not achieved, they were
redirected through additional instructional materials and alternative
test forms until mastery was exhibited. The students were required to
reach a predetermined achievement level that was equivalent to an A or 
B on the traditional grading system. 

Students learning biology and psychology were divided into
experimental and control groups (N=35 in each group). The results of
the summative test indicated that the experimental group achieved
approximately one-half standard deviation above the control group. For
students in philosophy, the grades of the experimental group were
compared to a control group from the previous year. Approximately 80 
per cent of the experimental group received an A or B compared to 60 
per cent of the control group. These results should be treated 
carefully due to the reporting technique used. Weak research design 
and statistical analyses should not be used to make even moderate 
inferences about a treatment. This dictum appears to have been vio- 
lated in this study. 

An investigation of the effectiveness of Bloom's mastery learning 
strategy for teaching a freshman college mathematics course was
conducted by Collins (1969). Two algebra courses for liberal arts
majors were used. Students were assigned to a mastery learning and a
non-mastery group.

The mastery learning group was given a list of course objectives
to be covered in each unit, each class session, and each assignment.
During each class session, up to ten minutes was allowed to solve a
problem based upon the objectives from the previous session and
assignment. The problem was then discussed and questions answered. 
Non-mastery learners received neither the objectives nor the daily 
problems. Both groups used the same textbook, covered the same 
material in class, and took the same summative test. 

In the algebra classes, 75 per cent of the mastery compared to 30 
per cent of the non-mastery students achieved the criterion of an A or
B grade. In the calculus classes, 65 per cent of the mastery compared 
to 40 per cent of the non-mastery students achieved the criterion 
grades. In the mastery groups for both algebra and calculus, D and F 
grades were practically eliminated. The smaller differences in the 
percentages of students who attained the criterion under mastery and 
non-mastery learning conditions for the calculus courses may be
attributed to three factors: (a) the greater importance of the courses
to all engineering and science students; (b) the higher and more
homogeneous mathematical ability of the calculus students; and (c) the
clearer relationship between the problems discussed in class and the 
unit test problems. 

Green (1969) used a mastery learning approach with 150 under- 
graduate students in teaching an introductory physics course. He used 
self-paced instructional units with formative evaluations, tutors, and 
programmed review materials. The purposes of the study were to 
determine if this particular mastery learning approach facilitated 
student achievement and whether student enjoyment was affected. 

The results indicated that achievement on the final exam, as well
as enjoyment of the course, was as great as that of students who learned
under the traditional lecture-discussion-demonstration approach. Green
suggests that the use of student tutors rather than the use of
technological aids added a personal-social dimension to student
learning. It should be noted that no statistically significant results 
are reported. 

Kim's (1969) experiment examined the effectiveness of Bloom's 
strategies for mastery learning in Seoul, Korea where classes are 
predominantly very large (usually one teacher to 70 students). 

The research sample consisted of 272 seventh graders. Half were 
assigned to the mastery learning (experimental) group and half to the
non-mastery learning (control) group. These groups were comparable in
terms of I.Q. and prior mathematics achievement. Both groups were 
taught a unit on simple geometric figures for eight sessions by their 
own teachers. 

The results indicate that 74 per cent of the experimental compared 
to only 40 per cent of the control students attained the mastery 
criterion of at least 80 per cent correct answers on the summative 
achievement test. The data also reveal an interesting relationship 
between I.Q. and achievement under mastery and non-mastery learning 
conditions. Of those with below-average I.Q. (93), 50 per cent of the
experimental students compared to only 8 per cent of the control
students achieved the mastery criterion. Of those with above-average
I.Q., 95 per cent of the experimental students reached the criterion
compared to only 64 per cent of the control students. Thus, almost as
many mastery students with below-average I.Q. reached the criterion as
control students with above-average I.Q. Mastery learning appeared
most effective for students with below-average I.Q.
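
The comparison in Kim's (1969) data can be made explicit with a little
arithmetic. The sketch below, written in Python purely for illustration,
uses only the four percentages quoted above; the variable and function
names are assumptions introduced here and do not come from Kim's report.

```python
# Percentages of students reaching the 80 per cent mastery criterion,
# as quoted in the text for Kim (1969).
reported = {
    "below-average I.Q.": {"mastery": 50, "control": 8},
    "above-average I.Q.": {"mastery": 95, "control": 64},
}


def advantage(group):
    """Percentage-point advantage of the mastery group over the control group."""
    return group["mastery"] - group["control"]


# Hypothetical helper name; computes the advantage for each I.Q. group.
gaps = {name: advantage(group) for name, group in reported.items()}

for name, gap in gaps.items():
    print(f"{name}: mastery advantage of {gap} percentage points")
```

On these figures the mastery advantage is 42 percentage points for the
below-average group and 31 for the above-average group, which is
consistent with the statement that mastery learning appeared most
effective for students with below-average I.Q.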

A mastery learning strategy for teaching introductory under- 
graduate educational psychology was reported by Biehler (1970). The 
purpose of the strategy was to reduce examination pressure and 
competition among students through frequent test reinforcement. 
Students were allowed to select a traditional or mastery learning 
treatment group. 

The mastery learning option contained a list of course objectives
which was produced and circulated to each student. The list served as
a basis for the construction of three normatively graded unit tests.
Mastery performance was gauged at the cutoff for the ordinary A or B
grade score levels. Students who failed to reach mastery performance
reviewed the material and took an alternative test form. Three short
papers and a term paper were also required. Final grades were assessed
on the basis of mastery/non-mastery on the unit test and the writing of
acceptable papers. 

No statistical analysis of the data was attempted, but through
survey reporting Biehler suggested that students who performed poorly
on the initial examination did not give up, because the procedure
allowed alternative relearning procedures. These results are suspect,
however, because of the subjective reporting technique.

Gentile (1970) describes a mastery approach to the teaching of a 
college course in introductory educational psychology. The purposes 
were to guarantee that all students mastered the main concepts; to 
demonstrate how instruction emphasizing cooperation rather than
competition could be organized in the classroom; and to maximize
interactions between students, student proctors, and the teacher.
Student learning was self-paced over small instructional units. Study
questions were provided to each student; student proctors (students who
had already mastered the material) provided reinforcement and
preparation for the unit test. If mastery was not achieved, the
student was asked to review the material and then return for retesting. 
Proctors and the instructor were available at all times to help 
students review material. Each student who mastered all the units 
received an A. 

The results of the mastery treatment were compared to a similar 
course more conventionally taught through large group, required 
lectures and smaller discussion group sessions. The mastery approach
produced significantly better understanding (p<.001) of comparable
material taught in both courses. On identical forms of the course
evaluation sheet, 74 per cent of the mastery students compared to 21 
per cent of the control students indicated they enjoyed taking the 
course. 

The achievement gains in this study must be called into question. 
Gentile indicated that "comparable" material was used with the control
group. The failure to use the same treatment materials introduces a
confounding variable that is difficult to control for and hence must
influence generalizations based on the results. 

In a later experiment, Kim (1970) reported the results of a large- 
scale expansion of his earlier experiment in mastery learning. Nine 
middle schools (approximately 5,800 seventh graders) in Seoul, Korea, 
participated. The experiment covered eight weeks of learning in
mathematics and English.

Instructional strategies adopted in this project were much the
same as those used in the first study (Kim, 1969), except that a
diagnostic test to detect learning deficiencies and the necessary
compensatory programmed units were administered prior to the regular
instructional sessions.

The results indicate that the percentage of experimental students 
attaining mastery (80 per cent correct scores on the final summative
examinations) varied widely across the sample schools. On the average,
however, 72 per cent of the students reached the mastery criterion by
learning English under experimental conditions compared to only 28 per
cent learning under standard instructional conditions. In mathematics, 
an average of 61 per cent of the mastery compared to 39 per cent of the 
non-mastery students attained the summative achievement test criterion. 
Two schools did not follow the prescribed procedures. If the results 
for these schools are ignored, then 75 per cent of the mastery students 
attained the criterion level in English and 67 per cent in mathematics.

Fluctuations from school to school in the percentage of experimental
students attaining the mastery criterion appear to have been
caused by (a) variation in school learning climate, (b) variations in
school and teacher cooperation, and (c) inefficient utilization and
administration of the instructional materials. The school and the 
teacher are often variables that are overlooked in research. This 
experiment points up the importance of gaining full support and 
cooperation from the school, and its teaching and administrative staff. 
Naturally the findings of Kim must be interpreted carefully by
instructional researchers. Korea is not the United States. However, we
cannot afford to dismiss his results. He has used large numbers of
students in order to support significant cognitive gains; hence, 
statistical differences may have been due to the large number of 
subjects used in the study and not necessarily to the effects of the 
treatment. 
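
The caution about sample size can be illustrated with a standard
two-proportion z statistic. The sketch below is an illustration only:
the proportions (.53 and .50) and the group sizes are hypothetical and
are not taken from Kim (1970), and the function name is likewise an
assumption introduced here.

```python
import math


def two_proportion_z(p1, p2, n_per_group):
    """z statistic for the difference of two proportions with equal group sizes.

    Uses the pooled-proportion standard error of the usual large-sample test.
    """
    pooled = (p1 + p2) / 2
    standard_error = math.sqrt(2 * pooled * (1 - pooled) / n_per_group)
    return (p1 - p2) / standard_error


# Hypothetical example: the same 3-point difference at three group sizes.
for n in (100, 1000, 3000):
    z = two_proportion_z(0.53, 0.50, n)
    verdict = "significant at .05" if abs(z) > 1.96 else "not significant at .05"
    print(f"n per group = {n}: z = {z:.2f} ({verdict})")
```

With the difference held fixed, the z statistic grows with the square
root of the group size, so a difference that is not significant with 100
students per group can exceed the conventional 1.96 cutoff with several
thousand.
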

Summary

The nine studies which compared mastery with non-mastery support 
the idea that mastery procedures facilitate learning significantly more
than non-mastery or control procedures. This finding is predictable in 
that a new or novel classroom learning mode will often find statistical 
significance in a classroom when compared to a traditional mode. 

All studies reported here were conducted in intact classrooms.
Individual differences were minimized through a variety of class-paced
and individualized mastery learning techniques. However, the use of the
classroom as the unit of statistical analysis does not provide strong
support for studying characteristics such as the individual, the teacher,
or the classroom technique. Consequently, other independent
variables need to be isolated and evaluated.

The traditionally arranged classroom contains students who possess 
varying aptitudes to learn. An important question is whether mastery 
learning facilitates student achievement equally for all students or 
whether students with certain aptitudes can benefit more from exposure to
a mastery learning procedure.

Mastery Learning and Aptitude

Two studies have been located that specifically research the
effects of mastery learning on students of varying aptitude (Table 2.3).
The criteria for determining aptitude range from using I.Q. scores to 
selecting advantaged and disadvantaged socio-economic populations. 

Carroll and Spearritt (1967) used 208 grade six students to 
observe relationships of intelligence and quality of instruction to
achievement. Treatment was provided by self-instructional booklets
containing rules about verbs of an artificial language. The booklets
differed in their presentation of the rules and in the amount of
explanation of mistakes. Form A, the high quality of instruction form,
presented each rule, tested it before presentation of subsequent rules,
and referred the student to pages on which his mistakes were explained.
Form B, the low quality of instruction form, presented a large quantity 
of disorganized information. The explanation of mistakes was also 
inadequate. Measures of learning rate, achievement, interest, and 
perseverance were administered. 

This study determined that poor quality instruction depressed the 
performance of students at all the intelligence levels. However, there 
was an interaction between intelligence and the quality of instruction
with respect to the student's willingness to persevere on a difficult
post-experimental task. Students in the high and low intelligence
groups who used the structured materials spent more time on the task
than students in the middle intelligence group. Since, in this study,
the average intelligence students applied themselves more to the
post-experimental task, their perseverance increased. However, the
researchers speculated that poor quality of instruction decreased
perseverance for the high and low intelligence students. A further
finding was that learning was inefficient when students had insufficient
opportunity to learn, particularly where the instructional quality was
poor and students were of low intelligence.

Kersh (1970) developed a mastery learning procedure based on the 
Carroll model and applied it to a unit in fifth-grade arithmetic. The
unit was taught to six "advantaged" and six "disadvantaged" classes.
This experiment has been reported in connection with the effects of a
mastery learning procedure upon student retention (see Table 2.4).

The results of this study indicate that on the same achievement
test and using the same mastery standards, there were significant 
increases in the proportion of experimental students (mastery class) 
attaining mastery compared to the proportion of students (control 
class) who attained mastery from the previous year. These increases 
ranged for one advantaged class from 19 per cent in the 1966 control 
class to 75 per cent mastery in the 1967 mastery learning class. The 
same teachers were used in both years. Moreover, a disadvantaged class
increased from 0 per cent attaining mastery in 1966 to 20 per cent 
attaining it in the 1967 mastery learning class. This may be an 
indication that the mastery learning procedure might be helpful in at 
least partially overcoming the cumulative deficit in learning 
apparently manifested in socio-economically disadvantaged students.

Summary

The question is, does mastery overcome the learning difficulties 
of students with varying aptitudes? 

Aptitude is a personal quality of the learner. Therefore, it is 
questionable whether intact classes (Kersh, 1970) should be used as the
unit of statistical analysis. Class mastery, by the Bloom hypothesis,
retards the high aptitude student in an effort to advance the low
aptitude student. Consequently, class mastery may well be contrary to
the principle of individual instruction. 

This review of mastery learning and aptitude variation determines 
that the original question has not been firmly answered by the evidence 
provided. It would appear, therefore, that a study that manipulates 
aptitude levels with other learning variables is required at this time.

Mastery Learning and Retention

Brownell (1948) refers to retention as the maintenance of skills
or knowledge with no practice after the completion of the learning.
Four studies within the mastery learning paradigm have been located
that focus on retention as a variable (see Table 2.4). These studies
tend to demonstrate the superiority of the mastery learning approach
but they are not definitive.

Block (1970) established two tasks for his study. First, a
rationale for setting objective, criterion-referenced performance
standards for sequential learning tasks was proposed, applied and
validated; second, cognitive and affective consequences of requiring
students to maintain particular mastery levels throughout the learning 
of a sequential task were examined. 

Three sequential units of elementary matrix algebra were taught to
ninety-one eighth graders over a school week. However, due to student
recalcitrance, 17 per cent of the sample were dropped during the study.
Many of these students were of low aptitude. Students were randomly
assigned to either a control treatment or one of four mastery
treatments. The control group learned algebra at their own pace with
no criterion level required, but the mastery treatment groups were
required to exhibit criterion performance on one unit before proceeding
to the next. Each of the mastery groups was required to learn a
different percentage of the material: either 65, 75, 85, or 95 per
cent.

The findings indicate that there was a linear relationship between
the percentage of material learned per unit and student retention as
measured on a parallel form of the summative achievement test
administered two weeks after the close of instruction; that is, the
higher the level to which each unit was learned, the greater the
retention. However, only those students learning to the 85 and 95 per
cent criterion retained the algebra to a significantly greater extent
than the non-mastery treatment group. These results must be viewed
tentatively as the number of students in the treatment groups was
small, thus limiting the scope of the study. Moreover, it is difficult
to ascertain when a retention measure should be administered to measure
retention effectively.

In Kersh's (1970) study, six classes of fifth-grade students from 
socio-economically advantaged backgrounds and six classes from
socio-economically disadvantaged backgrounds were taught arithmetic by their
regular teacher over a full school year. The mid-year and end of year 
performance of these students was then compared with the mid-year and 
end of year performance of equivalent classes from the previous year. 
Further, students in the experimental classes were retested with a 
parallel form of the final exam at the beginning of the sixth grade.

The results of this study are equivocal. Because of teacher
inability to follow the experimental procedures, only one class
produced significant gains in achievement and retention. Sixty-three per
cent of the mastery learning students still achieved to the 80 per cent
criterion on the retention test administered at the beginning of grade 
six. 

Kersh's study may be suspect due to the researcher's inability to 
control the teacher factor. This point reinforces a similar point made
concerning Kim's (1970) study, that is, the necessity to control
contextual variables, especially classroom variables.

Romberg, Shepler, and King (1970) have reported results similar to 
both Block and Kersh. They had previously taught sixth grade students 
one unit of mathematical proof and another unit of probability and 
statistics. Students were expected to learn to a 90 per cent criterion
level. Two weeks after the end of instruction, the students were given 
a delayed posttest using the identical form of the unit final 
examination. 

Romberg and his colleagues found that the correlation between 
achievement and retention was .75 and .78 for the proof and probability
units, respectively. The individual retention ratios, i.e., amount
retained/amount learned, were approximately .95 for both units. It was
also found that the mastery learning students exhibited significantly 
greater retention of the material learned than a matched group of non- 
mastery learning students. 
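
The retention ratio defined above is a simple quotient. The following
Python sketch shows the computation under stated assumptions: the
function name and the two scores are hypothetical and are not data from
Romberg, Shepler, and King; they were chosen only so that the result
matches the order of magnitude reported.

```python
def retention_ratio(amount_retained, amount_learned):
    """Retention ratio: amount retained divided by amount learned."""
    if amount_learned <= 0:
        raise ValueError("amount_learned must be positive")
    return amount_retained / amount_learned


# Hypothetical student: 40 items correct on the unit final examination,
# 38 items correct on the delayed posttest two weeks later.
ratio = retention_ratio(amount_retained=38, amount_learned=40)
print(f"retention ratio = {ratio:.2f}")
```

A ratio near 1 means that nearly everything learned was still available
at the delayed test; the ratio says nothing by itself about how much was
learned in the first place.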

In evaluating these results, it must be remembered that the same 
items appeared on the posttest and the retention measure. If students
were given feedback about their posttest performance, then this feedback
might have inflated the retention results.

In an unusual experiment, Wentling (In Press) taught a group of
high school students a unit on automobile ignition systems as part of a
course in automobile mechanics. Half of the students learned under a
mastery learning strategy and half learned under a non-mastery learning
strategy. The two types of instructional strategy were then crossed
with two levels of intelligence and three feedback conditions. The data
indicated that the mastery treatments yielded significantly greater
scores than the non-mastery treatments on a retention measure
administered after an undisclosed time at the end of instruction.

A probable flaw in this study is that Wentling used a feedback
condition under both types of instruction.
Accordingly, the retention data reported within the cells were 
confounded, which may have produced spurious retention results. 

Summary

The mastery studies reported for the dependent variable, retention,
have all been conducted in a Bloom class-paced situation. No retention 
studies have been found that have used the individual as the unit of 
analysis. Block (1973) concedes that more definitive research is 
required to determine whether individuals who learned to mastery 
retained more material than individuals who learned under non-mastery 
conditions. 

Mastery Learning and Time-to-Testing

Few studies have examined the decremental and incremental effects
of a mastery learning procedure upon time spent in learning (Table 2.5).
Merrill, Barton, and Wood (1970) pursued this line of research when
they examined the effectiveness of a procedure to facilitate student
learning of a hierarchical learning task. It was proposed that
specific review at each stage where difficulties were encountered in
student learning of a task should facilitate learning at subsequent
stages. Forty college students were randomly assigned to two groups to 
learn an imaginary science through a five-lesson teaching machine 
course. In the experimental group, a specific review, step-by-step
explanation was employed to facilitate learning of mislearned material. 
The control group did not receive specific review. In both groups, each 
lesson was followed by a quiz with no feedback of results. Immediately 
following the five lessons and quizzes each student was administered a 
criterion test. 

The findings indicate that specific review following difficulties 
made experimental student learning increasingly efficient. The total 
time spent on original learning by the experimental group decreased 
successively across the five lessons. Further, the total time spent by 
the experimental group to complete the five lessons and accompanying 
quizzes, including the specific review material, was slightly less than
the time spent by the control group. In other words, the experimental
students studied more material than the control students but took less
total time to learn it. 

Support for the conclusion of Merrill, Barton, and Wood was 
provided by the results of Block's (1970) experiment (see Mastery 
Learning and Retention). In this study, the average total amount of 
learning time spent by each group was broken into the time spent in
textbook learning and the time spent in correction/review. Attention
was focused on the time spent by each group in original, textbook
learning. The data revealed that as student learning progressed from
unit to unit, students who maintained the 95 per cent level spent less 
time in original learning than students who maintained the other 
levels. This was especially apparent by unit three. Students in the 
95 per cent group spent approximately the same average learning time as 
the control group. Hence, even if learning efficiency was measured in 
terms of per unit time rather than total learning time, the maintenance 
of none of the required levels made student learning more efficient 
than it might have been. 

However, when learning efficiency was defined as the ratio of the 
average amount of original learning per unit to the average amount of
learning time per unit, then the maintenance of the 95 per cent level
made pupil learning more efficient by unit three. Students required to

maintain the 95 per cent level learned approximately 40 per cent more 
material from textbook unit three than the control students, but they 
spent roughly the same amount of time in original learning as the 
control students. 
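
The efficiency ratio described in this paragraph can be worked through
with normalized values. In the Python sketch below the control group's
learning and time on unit three are both set to 1.0 as an assumption for
illustration; the 1.4 figure reflects the "approximately 40 per cent
more material" reported above, and the function name is hypothetical.

```python
def learning_efficiency(amount_learned, learning_time):
    """Ratio of original learning per unit to learning time per unit."""
    if learning_time <= 0:
        raise ValueError("learning_time must be positive")
    return amount_learned / learning_time


# Normalized, illustrative values for unit three (not raw data from Block).
control = learning_efficiency(amount_learned=1.0, learning_time=1.0)
mastery_95 = learning_efficiency(amount_learned=1.4, learning_time=1.0)

# Efficiency of the 95 per cent group relative to the control group.
relative = mastery_95 / control
print(f"relative efficiency = {relative:.1f}")
```

Because time in original learning was roughly equal for the two groups,
the relative efficiency is driven almost entirely by the difference in
the amount learned.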

It should be noted that in both of these studies elapsed time 
was recorded rather than the time spent in actual learning. Stu- 
dents rarely utilize the complete amount of time allowed for each 
subject.

Summary

Both reported studies used class-paced sequential abstract content 
over a short learning period of five days. Students,
particularly the slower learners, did not work on task for long, and so
possibly did not experience the accompanying frustration that a longer
period of time may have imposed. Therefore, there is a strong need for a
study that provides individual students, of varying aptitude, with
sufficient time to attempt a structured learning task that is longer
than five days. This would also provide an opportunity to observe the
effects of the procedures over time using an individual-paced procedure
rather than a class-paced procedure.

Mastery Learning and the Social Sciences 

Most mastery learning research has focused upon those disciplines 
that lend themselves to sequencing and hierarchical arrangement. Math 
and science have been well represented. However, there has been and
continues to be a notable lack of research dealing with disciplines
falling under the rubric of social science. The compartmentalizing of
subject matter materials of the social science disciplines is not so
readily possible as it is in mathematics or science (see Table 2.1). 
However, two studies that use social science disciplines in a mastery 
context were located (Table 2.6). 

Gaines (1971) conducted a study with students from the fifth,
sixth, seventh, and eighth grades using Georgia Anthropology Curriculum
materials. The purpose of his study was to test presumed relationships
of certain variables in John B. Carroll's model of school learning
using two mastery learning strategies. Achievement, the interaction
between quality of instruction and ability to understand instruction,
and the correlation of ability to understand instruction and degree of
learning for both strategies were the specific variables under
consideration.

While Gaines theoretically outlined two different treatment
strategies - a mastery strategy of formative, multiple choice tests,
and a non-mastery strategy of a workbook with completion items - the
actual differences between the strategies may not have been sufficient
to produce significant achievement differences.

Gaines also encountered inconsistencies with the administration of 
the treatment materials in schools. However, the eighth grade 
comparison favored mastery treatment using formative tests and was 
found significant. 

Tierney's (1973) study involved two comparisons. The first compared
the feedback/correction components of two mastery learning strategies
with a traditional lecture-discussion strategy to determine whether the
mastery strategies produced significantly greater student achievement
and a more favorable attitude toward learning than the traditional
mode. The second compared an alternative instructional mode with the
redirection of students into an original stimulus mode. His sample
consisted of
forty-five volunteer college students enrolled in an upper division 
European History class. 

The study found no significant differences for either the
achievement or affective criteria on the first comparison. However,
significant differences were indicated between the two different
mastery correction procedures on the application section of the
achievement criterion. The alternative instruction mode produced
students more able to apply the course material than those students who
were redirected to original learning material.

Summary

Mastery learning research in the areas of social science has been 
minimal. The two studies reported suggest that application of a
mastery strategy to the social science disciplines is in the formative
stages. Consequently, there is a real need for a systematic appraisal 
of learning and contextual variables associated with mastery learning 
as applied to the social science disciplines. 

Conclusion 

Mastery learning has many roots in earlier pedagogy. These
antecedents assisted Bloom to conceptualize what has now been termed
'mastery learning'. Since 1968, when Bloom coined the phrase, there
has been little systematic study of variables associated with mastery
learning.

Empirical studies comparing a mastery to a non-mastery procedure
predictably show support for mastery across a wide range of content
areas. However, there has been no systematic attempt to determine
whether slow learning students benefit from constant correction and
feedback or whether, as Bloom claims, mastery can induce learning for
nearly all students, particularly when students are in an
individual-paced situation with sufficient time to complete each
learning unit.

A measure of effective learning is retention. The length of time
between end-of-instruction and administration of the retention measure
would seem to be important. In two of the reported studies two weeks
intervened; one study did not report the interval, while the fourth
tested after a summer break. A retention measure administered during
the same academic year, and with a longer intervening time interval,
could lend stronger support to the mastery paradigm, particularly if
slow learning students are involved.

The studies using time show that more efficient and economical use 
of school time might be expected. However, duration of the treatment 
may be a vital factor. Both studies ran for five days. Consequently, 
the question of whether a mastery procedure can save learning time or 
increase the amount covered or learned is not resolved. 

Mastery learning research has been haphazard in its design and 
approach to independent and dependent variables. What is required is a 
systematic appraisal of the learning and contextual variables. This 
study is the first of a planned series that will manipulate selected
independent and dependent variables. The independent variables to be
used in this study are aptitude, treatment, and class while the
dependent variables are learning, retention, and time-to-testing.
Social science materials have received scant attention from researchers.
The use of geography material in this study, within the mastery context,
answers the call from Gaines (1971) and Tierney (1973) for application
of a mastery procedure to disciplines of the social sciences.

The next chapter reviews the general methodologies and specific 
procedures used in developing the materials. The materials were used 
to test the questions raised in discussions from this and earlier
chapters.

Chapter III
Development of Materials Used in the Study

In the present study, the development of the treatment materials, 
the treatment procedure, and the testing instruments were of primary
importance. This chapter describes the development of four elements: 
1) curriculum materials; 2) treatment procedures; 3) characteristics 
and construction of the geography achievement test; and 4) alterations 
in treatment procedures. 

Construction of the Curriculum Materials 

Treatment preparation for the experiment consisted of the
development of the unit Functions of Cities (Jones, 1974). The same
student text was developed for both the mastery (T1) and the
non-mastery (T2) learning groups. The mastery workbook differed in amount
of correction and feedback; content and workbook exercises were
identical in the mastery and non-mastery workbooks.

Text Content

The text Functions of Cities consisted of nine chapters, as listed 
in Table 3.1. Chapter 1 "Economic Base and Function" introduced the 
two main generalizations, "function" and "economic base." Function was 
defined and illustrated in terms of the relation of the city to the 
economy of the country; economic base was defined and illustrated in 
terms of the way people in a city depended upon the most important 
economic activities for their livelihood. The introductory chapter
also gave an overview of the eight cities and their functions and was
designed to serve as an advance organizer to the unit (Ausubel, 1963).
The next eight chapters then gave a descriptive and analytical 
presentation of eight cities in terms of their salient economic 
functions. The text concluded with a glossary. 

Table 3.1

Cities by Geographic Location and Function,
Text, Functions of Cities (Jones, 1974)
Table of Contents

Chapter  City              Country        Continent      Function

1        Introduction
2        Durban            South Africa   Africa         Port
3        Frankfurt         West Germany   Europe         Commerce
4        Pittsburgh        United States  North America  Industry
5        Brasilia          Brazil         South America  Government
6        Surfers Paradise  Australia      Australia      Resort
7        Benares           India          Asia           Religion
8        Mexico City       Mexico         South America  Dominant City
9        Tokyo             Japan          Asia           Super City

The eight cities were selected to provide type illustrations of 
function and to give geographic coverage of all the continents, except 
Antarctica. Europe and North America were underrepresented in terms of
geographic coverage. However, each city served as an example of the 
function of a city with certain economic characteristics. Thus, while 
Durban was selected as an example of a port city, other characteristic 
port cities, such as New Orleans, Rotterdam, or Fremantle, might have 
been selected. 

Other considerations than type criteria entered into city
selection. The recency of the development of Brasilia as well as its
modern planning and buildings were influential in the selection of this
city as a government type. Mexico City was selected as the dominant
city not only because of its commanding position within the economic and
political life in Mexico, but also because of the interest the Geography
Curriculum Project has in Mexico as a potential area of field study in
connection with the new Latin American Studies Program in the Department 
of Social Science Education. Tokyo was selected as an example of a 
super city not only because of its function in the world economy, but 
because it serves as an example of a non-western city achieving 
international prominence. 

Categorization of cities by a particular function was not used as 
a device to restrict discussion of the interrelationship of economic 
activities. Each chapter attempted to show that while a city might be 
categorized by a function, with a principal economic base, economic 
activities interact. The writer believed that this method of presenta- 
tion not only had the merit of contributing geographic diversity to the 
presentation, but also permitted an intensive development of the 
conceptual economic base that relates to the modern urban environment. 
The workbook required the student in most cases to apply the knowledge 
of one type of city to another similar city which has not been studied. 
Therefore, the text and workbook together provided a basis for a clearer 
understanding of world urban economics. 

Chapter Format

The format for the eight city chapters followed a structured 
presentation. The basic format was:

Part 1   Organizer: Application of generalizations. Use of a map of
         the country.
Part 2   Introduction of the functional city. Use of a city map.
Part 3   Narrative on specific, unique features of functional city.
         Use of pictures.
Part 4   Summary.

Part 1. This section acted as an advance organizer for the specific
concepts and facts that followed in Parts 2 and 3. The two major
generalizations 'function' and 'economic base' were used within the
context of the functional city. A map of the country, locating the
city, was used.

Part 2. Each selected city was discussed in general terms to provide 
an overview of the city and to identify specific pertinent character- 
istics of the city. A map of the city, locating many of the specific 
pertinent characteristics, was provided. 

Part 3. Each characteristic was developed to provide the student with 
an examination of the economic forces within the city type. Pictures 
were used to supplement the narrative. 

Part 4. This section provided a succinct summary statement concerning 
the narrative of the study. Appendix A contains a sample copy of the 
student text. 

The previous sections have described the content of the textbook 
for the treatment unit. The next section will discuss the workbook 
content and format. 




Workbook Content 

The content in the workbooks was the same for both treatment
groups. Each chapter in the student text had a parallel chapter in the
workbook. The various activities that the students were required to
complete were premised upon the reading and study of material appearing
in each chapter of the student text. The learning outcomes expected
from each set of workbook exercises were dependent upon the type of
activity that had to be completed. Activities within each chapter of the
workbook ranged from recall of knowledge and facts to generalizations and
applications of concepts. These activities were written at the knowledge
and application levels of the Taxonomy of Educational Objectives:
Handbook I. Cognitive Domain (Bloom, Engelhart, Furst, and Hill, 1956).
Questions also focused upon the maps and pictures appearing in the
student text. Consequently, students were expected to display
observation and map reading skills as well as stated learning skills.

Activities in the workbook were presented in a variety of forms.

Workbook Format

The workbook aided the student to learn new knowledge about the 
functions of cities. It also provided the student with practice in 
using the knowledge learned. Practice was provided through the 
activities that were available in each of the chapters. Each chapter 
contained a combination of the following activities: 

1. Main Words 

2. I can match words with definitions. 

3. I can write a definition for each main word.

4. I can match an example or illustration of the main words.

5. I can write an example or an illustration of the main words.

6. I can do or explain activities.

7. Thought Questions

8. Review Test(s)

The consistent use of this format provided students with a greater
opportunity to operate with the content and knowledge required by each
of the activities. Less time had to be spent by the students to
decipher how they were to perform the learning task within the scope of
the treatment procedures. The only difference between the workbooks was
the inclusion of an extra review test in the mastery workbook. This
difference is discussed in the next section.

Treatment Procedures

Two treatment procedures were employed in this study. There was a
mastery (T1) and a non-mastery (T2) learning procedure. As the
materials used in this study were the same for both treatment groups, the
focus of the study was on the manipulation of various components within
the mastery treatment procedure. The two treatment procedures were
conceptualized in the following format:

                                          Treatment 1   Treatment 2
Component                                 (Mastery)     (Non-Mastery)

Presentation: Narrative                        X             X
Student Workbook: Activities                   X             X
Diagnosis: Review Test One                     X             X
Correction                                     X             O
Feedback                                       X             O
Remediation: Prescriptive Review               X             O
Remediation: Specific Practice                 X             O
Remediation: General Review                    X             O
Diagnosis: Review Test Two                     X             O
Correction                                     X             O
Summative Test (administered to all
  students at the conclusion of the unit)      X             X
Weekly Class Discussion                        X             X

The X's indicate the components that were used in the procedure,
while the O's indicate those components not used.

To help students learn the material in the text, the workbook provided
mastery and non-mastery procedures.

Procedures Common to Both Treatments

All students, regardless of treatment, had to follow steps 1-9.
(Appendix B is a sample copy of the non-mastery (T2) workbook.)

1. Fill in the time log at the beginning of each chapter with 
time work begun and the date. 

2. Read one chapter of the student text. Begin with Chapter One. 

3. When ready open the workbook, close the text, and work through 
the activities. 

4. When the activities were completed the student turned to the
answer sheets at the back of the workbook and corrected his work.

5. If any activities were incorrect the student re-read the text 
and then did the activities over. 

6. When the student was ready the student indicated to the
teacher readiness to take a review test.

7. The review test was self-administered.

8. The student corrected the review test from the answer sheets 
at the back of the workbook. As each review test contained 20 items a 
score out of 20 was recorded. 

9. The classroom teacher checked the results of the test, and
non-mastery students completed the time log with the ending time and the
number of minutes worked. Non-mastery (T2) learning students then
proceeded to the next chapter and followed the same procedures. However,
mastery (T1) learning students were required to perform remedial
learning tasks.

Additional Procedures for Mastery Learning

Procedures 10-20 applied only to students in the mastery treatment.
(See Appendix C for a sample copy of the mastery workbook.)

10. The criterion level arbitrarily selected for this study, but 
supported by the studies of Block (1970) and Kim (1969), was 85 per cent. 
This meant that on a review test of 20 items students needed 17 items 
correct to reach the minimum criterion level. If a student got 17 out 
of 20 items correct or better the student proceeded to the next chapter 
and repeated the same procedures. 

11. If a student got less than 17 items correct out of 20 the 
student looked at the incorrect items on the test. Each item had a key
beside it, e.g. 1.2A. The 1.2 refers to chapter one, page 2 in the
student text, and the A refers to the specific paragraph on that page.
This paragraph contained the correct answer to the question.

12. Students were directed to re-read the paragraph in the text 
for the incorrect test item. 

13. Students were directed to correct incorrect workbook items. 

14. When all incorrect items had been corrected, the student was 
directed to review all the work in both text and workbook. 

15. Students then informed their teacher of their readiness to 
take a second review test. 

16. Students then self-administered the second review test. 

17. Students corrected the second review test from the answer 
sheets at the back of the workbook. 

18. The completed second review test was then presented to the
classroom teacher, who recorded whether or not mastery had been reached.

19. Students then completed the time log for the chapter.

20. Mastery students then proceeded to the next chapter, whether
or not they had achieved mastery.
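
The decision rule in steps 10-20 can be sketched as follows. This is a
minimal illustration, assuming the 85 per cent criterion, the 20 item
review test, and the "1.2A" key format described above; the function
and type names are hypothetical and do not come from the workbook.

```python
import re
from typing import NamedTuple

ITEMS_PER_TEST = 20       # each review test contained 20 items
CRITERION_PERCENT = 85    # criterion level used in this study


class ItemKey(NamedTuple):
    chapter: int
    page: int
    paragraph: str


def items_needed(n_items: int = ITEMS_PER_TEST,
                 criterion_percent: int = CRITERION_PERCENT) -> int:
    """Smallest number of correct items that meets the criterion level."""
    # Integer ceiling of n_items * criterion_percent / 100.
    return (n_items * criterion_percent + 99) // 100


def parse_item_key(key: str) -> ItemKey:
    """Split a key such as '1.2A' into chapter, page, and paragraph."""
    match = re.fullmatch(r"(\d+)\.(\d+)([A-Z])", key.strip())
    if match is None:
        raise ValueError(f"unrecognized item key: {key!r}")
    chapter, page, paragraph = match.groups()
    return ItemKey(int(chapter), int(page), paragraph)


def next_step(score: int, attempt: int) -> str:
    """Return the mastery student's next step after a review test."""
    # Students moved on after the second test whether or not they
    # had reached the criterion (step 20).
    if score >= items_needed() or attempt >= 2:
        return "proceed to next chapter"
    return "remediate and take second review test"
```

For example, a first-attempt score of 16 sends the student back to the
keyed paragraphs, while a score of 17 or better allows the student to
proceed.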

The above workbook procedures were designed so that the student 
could study and learn without direct instruction by the teacher. 
Therefore, students were responsible for the learning tasks, correction 
to find errors, and the remedial stages of the learning procedure.

The weekly class discussion deserves special comment. During the 
initial administration of the materials to the students, teachers were 
instructed to provide students with a break from the self-instructional 
mode every three days. Both the researcher and his major professor felt
that this might alleviate such problems as boredom and work fatigue that 
appeared in other self-instructional studies (Dumbleton, 1973; Pelletti, 
1973). After a week of instruction with the materials, however, the 
three-day discussion was waived in favour of having a weekly class 
discussion on Wednesday during the middle of the school week. All 
teachers reported that this was a more satisfactory arrangement. Both 
students and teachers reported at the conclusion of treatment that the
weekly class discussion was a major contribution to maintaining student
interest and perseverance.

Readability

Functions of Cities was written for students in the middle grades. 
The most appropriate way to determine whether the materials were 
satisfactory for this age level would have been to administer them
across a broad cross-section of levels. However, the limited resources
of the researcher precluded the use of this approach and consequently
did not allow a complete evaluation of the materials. A second method is
to establish readability by using a standardized reading formula. The
three most commonly used readability formulas are Dale-Chall, Spache,
and Flesch. In this study the Rudolf Flesch (1949) formula for
readability was applied. It was selected primarily because it is more
appropriate for materials used with upper elementary and junior high
school students. Second, the Flesch formula does not rely upon a
specific word list (e.g. Dale-Chall) which can get out of date (Powers,
Sumner, and Kearl, 1958).

The Flesch formula requires that a number of steps be followed.
First, the average number of words per sentence must be determined.

(a) Count as a sentence each unit of thought that is grammatically
independent of another sentence or clause. Its end may be marked by a
period, question mark, exclamation point, semi-colon, or colon. Also
count a fragment as a sentence.

(b) Count the words in ten sentences separately, add the counts,
then divide by ten.

Second, count the syllables in 100 words. When these tasks have
been completed, the following arithmetic operations should be conducted:

(a) Multiply the average sentence length by 1.015

(b) Multiply the number of syllables in 100 words by .846

(c) Add (a) and (b)

(d) Subtract this sum from 206.835

(e) This provides the Reading Ease Score
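
The arithmetic steps above can be sketched in code. This is a minimal
illustration that takes the sentence and syllable counts as given; the
function names and the sample counts are hypothetical and are not drawn
from Functions of Cities.

```python
from typing import Sequence


def average_sentence_length(word_counts: Sequence[int]) -> float:
    """Mean number of words per sentence for the sampled sentences."""
    if not word_counts:
        raise ValueError("at least one sentence is required")
    return sum(word_counts) / len(word_counts)


def reading_ease(avg_sentence_length: float,
                 syllables_per_100_words: float) -> float:
    """Flesch Reading Ease Score from the two counted quantities."""
    weighted_length = 1.015 * avg_sentence_length          # step (a)
    weighted_syllables = 0.846 * syllables_per_100_words   # step (b)
    total = weighted_length + weighted_syllables           # step (c)
    return 206.835 - total                                 # steps (d), (e)


# Hypothetical sample: ten sentences averaging 12 words each,
# with 125 syllables counted in a 100 word passage.
sample_word_counts = [10, 14, 12, 11, 13, 12, 9, 15, 12, 12]
score = reading_ease(average_sentence_length(sample_word_counts), 125)
```

With these illustrative counts the score is roughly 88.9. Shorter
sentences and fewer syllables both raise the score, which corresponds
to easier reading.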

The Reading Ease Score is then tested against a table which
provides information concerning the readability level of materials. In
order to obtain a readability level for Functions of Cities, two samples
were selected from each of Chapters 2-9 and one from Chapter 1. Table 3.2
shows the Reading Ease Scores obtained for the samples.

The mean of the Reading Ease Scores was 85. When this score was 
tested against the readability scale it was found that the materials had 
a readability level of Grade Six.
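
The reported mean can be checked against the 17 Reading Ease Scores
listed in Table 3.2. The sketch below simply transcribes those scores in
sample order; the variable names are illustrative.

```python
# Reading Ease Scores for samples 1-17, transcribed from Table 3.2.
reading_ease_scores = [
    90, 93, 97, 80, 85, 89, 83, 63, 93,
    93, 89, 67, 100, 93, 86, 68, 73,
]

total = sum(reading_ease_scores)
mean_score = total / len(reading_ease_scores)

# Rounded to the nearest whole number, as reported in the text.
rounded_mean = round(mean_score)
```

The rounded mean agrees with the value of 85 given above. In Table 3.2
every sample scoring between 80 and 89 is labeled Grade 6, which is
consistent with the Grade Six level assigned to the materials.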

Construction and Characteristics of the Review and
Summative Achievement Tests

This section describes the construction and characteristics of the
review tests and the summative test (Geography Achievement Test), along
with the methods used to establish validity and reliability. Nine
review tests and one summative test were constructed.

Construction of the Review Tests

Each chapter in the workbook Functions of Cities contained a
review test. A review test measured the amount of learning in a chapter.
Each review test contained 20 items. Items were written in three forms:
1) three foil, multiple choice; 2) true or false; and 3) completion.
There was no consistent number of items in each form. Some chapters
contained more multiple choice items while others contained more true or
false, or completion items.

All review items were written strictly on the content in the 
student text. Each test item was keyed to a particular paragraph within 
the chapter. Items tested recall, application, and transfer of cogni- 
tive knowledge. 

Table 3.2

Reading Ease Scores and Grade Level for 17 Samples Selected from
Nine Chapters of the Materials Functions of Cities

Chapter   Sample   Reading Ease Score   Grade Level

1          1        90                  5
2          2        93                  5
           3        97                  5
3          4        80                  6
           5        85                  6
4          6        89                  6
           7        83                  6
5          8        63                  8 or 9
           9        93                  5
6         10        93                  5
          11        89                  6
7         12        67                  8 or 9
          13       100                  4
8         14        93                  5
          15        86                  6
9         16        68                  8 or 9
          17        73                  7

The non-mastery (T2) treatment contained one review test at the
conclusion of each chapter. Review tests used in both treatments were
exactly the same. The mastery (T1) treatment contained two review
tests. The second review test for the mastery (T1) treatment did not
contain new test items; the items used in the first review test were
merely reordered. The review tests can be seen in the workbooks in
Appendices B and C.
Content Validity 

Items were constructed on the premise that the text contained the
knowledge necessary to answer the questions. The researcher constructed
the items then keyed them to a particular paragraph in the chapter. The
researcher's major professor Dr. Marion J. Rice and graduate students
James S. Fagan and Robert R. Myers then checked the text and the test
items to establish that the test items measured the knowledge conveyed
in the text. Changes were made at their suggestion to reduce
ambiguities, improve form, and simplify language.

Knowledge and application items were constructed in accordance
with the Taxonomy of Educational Objectives: Cognitive Domain (Bloom,
et al., 1956) to measure student learning on each of the review tests.

Reliability

Within the context of the mastery learning procedure, the concept
of reliability applied to the review tests was considered inappropriate.
The review tests were criterion referenced tests, not norm referenced
tests. A criterion referenced test requires that students perform to an
arbitrarily selected criterion level. This study used the 85 per cent
achievement level on a review test as the criterion level. A norm
referenced test is used to determine an achievement score for individual
students; the scores vary from student to student and so have a range.
Because students were expected to achieve to a criterion level,
and hence there was little score variance, no reliability measures were
obtained for the review tests (Gronlund, 1973; Popham and Husek, 1971).
However, the summative test was treated differently.

Construction of the Summative Test

The final version of the summative test was in two parts. The 
first part was a 40 item, four option, multiple choice test while the 
second part was a 24 item, retrieval chart completion test. The total
64 item test was designed to measure the students' knowledge of facts,
concepts, and generalizations presented in the treatment unit. The 
procedures followed in constructing this test are outlined below: 

1. The major facts, concepts, and generalizations to be learned 
were identified in the treatment unit. (See Appendix D, p. 462 for a
list.) 

2. A table of specifications was drawn up and each content item 
was categorized for inclusion in the table. (See Appendix E, p. 464 for 
table of specifications.) 

3. A 40 item, four option, multiple choice test was constructed. This
task was simplified because items for the major facts, concepts, and
generalizations had been previously constructed for the review tests.
However, an extra option was added to each of the items selected and in
many cases the items were rewritten and reworded. A 24 item retrieval
chart was also constructed. A clue was provided for each of the eight
cities that were studied in the treatment unit and three extra pieces of
information were needed to fill in the blanks. (See Appendix F, p. 467
for the 64 item test.)

4. Dr. Marion J. Rice and the researcher were solely responsible 
for the writing and selection of the final summative test. No other 
people were as familiar with the content of the unit and the student
learning outcomes that were expected. The items selected were not only
appropriate to the content, but also appeared to display clarity,
understandability, and accuracy.

Content Validity 

Due to the process described above it was believed, by the
researcher, that no rules associated with content validity had been
violated. Consequently, it was assumed that the summative test met the
criteria for content validity.

Reliability

The summative test was a norm referenced test. Students responded 
to items to the best of their knowledge and the scores from the summative 
test were used as data for purposes of statistical analysis.
Unfortunately, due to the press of time no pilot testing of the measuring
instrument was conducted. Instead the following procedures were
followed:

1. The 64 item summative test was administered to the treatment
groups.

2. An arbitrary decision was reached by the researcher to select
from one of the treatment groups three classes that displayed the widest
range of scores as measured by the class mean on the measure used as the
blocking variable, the Iowa Tests of Basic Skills word meaning section.

3. The class scoring the highest mean and the class scoring the 
lowest mean both fell in the mastery treatment group. Both these 
classes were selected. Another class in the mastery treatment group 
closest to the mean was also selected. In all 76 students were used. 



4. Student responses on the 40 item, four option, multiple choice
test were transposed to IBM sheets. These scores were then analyzed by
the Analysis of Item and Test Homogeneity (ANLITH) computer program.

Table 3.3 summarizes the ANLITH results. 




Table 3.3

Test Analysis Data for the 40 Item Multiple Choice Instrument

                   Number of  Number of  Estimate of                 S.E. of
Grade              Students   Questions  Reliability  Mean    S.D.   Measurement

7 (Three Classes)     76         40         .89       21.40   8.34     2.76



5. The results of the ANLITH indicated that the 40 item multiple
choice test had a reliability of .89.

6. Item difficulty was examined for each item on the test. Two
items (numbers 12, 36) had high difficulty (under 30 per cent scored
correctly) while four items (numbers 16, 18, 26, 28) had low difficulty
(over 70 per cent scored correctly). As the test had already been
administered no changes were made. (See Appendix F, p. 467 for the 40
item test.)

Table 3.4

Test Analysis Data for the 24 Item Recall Instrument

                   Number of  Number of  Estimate of                 S.E. of
Grade              Students   Questions  Reliability  Mean    S.D.   Measurement

7 (Three Classes)     69         24         .95       13.00   7.69     1.77



7. A reliability analysis was conducted with the 24 item recall
test. The results of the ANLITH indicated that the 24 item recall test
had a reliability of .95.

8. As the test had already been administered no changes were
made. However, the low standard error of measurement of 1.77 was an
indication that this was a test that could measure individual students'
recall of knowledge. (See Appendix F, p. 467, for the 24 item recall
test.) Tables 3.5 and 3.6 show the mean scores and percentages of the
40 item and 24 item posttest measures by treatment and aptitude. These
tables show that there remained a range of scores consistent with the
aptitude levels of the students used in the present study. High
aptitude students achieved higher than middle and low aptitude students,
and middle aptitude students achieved higher than low aptitude
students.
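
The test statistics above can be related to one another with a short
sketch. It assumes the conventional classical test theory relation for
the standard error of measurement (standard deviation times the square
root of one minus reliability); whether ANLITH computed the value in
exactly this way is an assumption. The difficulty bands follow the 30
and 70 per cent cut-offs stated in step 6, and the label "moderate
difficulty" is an illustrative name for the unlabeled middle band.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory estimate: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def difficulty_band(proportion_correct: float) -> str:
    """Classify an item by the proportion of students answering correctly."""
    if proportion_correct < 0.30:
        return "high difficulty"      # under 30 per cent scored correctly
    if proportion_correct > 0.70:
        return "low difficulty"       # over 70 per cent scored correctly
    return "moderate difficulty"      # illustrative label for the middle band


# Values reported in Tables 3.3 and 3.4.
sem_multiple_choice = standard_error_of_measurement(sd=8.34, reliability=0.89)
sem_recall = standard_error_of_measurement(sd=7.69, reliability=0.95)
```

Under this assumption the multiple choice value comes out very close to
the reported 2.76. The recall value comes out near 1.72 rather than the
reported 1.77, a gap that could arise if the reliability estimate was
rounded before being tabled.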

Alterations in the Treatment Procedures 
Most experimental studies that use classroom learning materials 
conduct a pilot test of the materials and the treatment procedures. 
This study did not employ a pilot test phase because the time remaining 
in the school year after material development and duplication did not 
allow for a pilot run. It was therefore necessary to test the mastery 
procedure without a pilot trial. Detailed and careful procedures,
described previously, had been derived for the mastery and non-mastery
treatments.

During the first week of instruction teachers were requested to 
monitor the treatment procedures carefully and to observe the reaction 
of students. A Report From Teachers form was provided to each teacher.
(See Appendix G, p. 479 for a copy of the Report From Teachers.)
Questions concerning the treatment procedures, content, and student and
teacher reactions were included on the report. Teachers indicated that
the procedures were working satisfactorily. However, they did indicate
that a change from a three day break to a mid-week break might improve 
student learning and maintain interest. This change was instituted. 
The same report was given to the teachers after each succeeding week of 
instruction. All reports were satisfactory. 

The content of the text and the workbook had already been set. 
Consequently, even if satisfaction had not been expressed by the
teachers, there was very little that could have been done to alter the
content. All teachers, however, expressed concern for the reading level
of the materials. It should be noted that the mean reading level of the
population of students used in this study was 6.6 while the reading
level of the materials was Grade Six. The researcher decided that this
criticism was not germane in the light of these statistics.

A pilot test is an important part of an experimental study from
the standpoint of both test construction and treatment procedures.
However, it was not possible to conduct such a pilot test. While
certain precautions were built into the actual administration, the
researcher acknowledges that the absence of pilot testing is a
limitation of the study.

Summary 

This chapter outlined the development of the curriculum materials 
and measuring instruments used in the study. The curriculum materials 
described the economic base and function of selected cities around the 
world. The treatment units used in the study were constructed in two
formats. Treatment 1 was a mastery learning procedure while treatment 2
was a non-mastery learning procedure.

The remainder of the chapter described the format and construction 
of the review tests and the summative geography achievement test. 
Finally, an explanation was offered for the omission of the pilot test. 
However, a number of precautions were described that should have offset 
the disadvantages of the lack of the pilot test. The researcher 
acknowledged that this was a limitation to the study. 

The next chapter will describe the research design and the 
statistical procedures used to analyze the data. 

CHAPTER IV
METHODOLOGIES AND PROCEDURES 

This chapter describes the following six elements of the study: 
1) experimental design; 2) experimental study; 3) pattern of logic used 
in the study; 4) contextual variables; 5) statistical procedures; and 
6) limitations.

EXPERIMENTAL DESIGN

A 3 x 10 x 2, aptitude by classes-nested-within-treatments, by
treatments, multivariate analysis of variance (MANOVA) using three
measures of effect was employed with the posttest data of this study.
This design is shown in Table 4.1.
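
As a rough illustration of the layout just described, the cells of the design can be enumerated as below. This is only a sketch under stated assumptions: the labels and the class numbering are hypothetical names introduced here, and only the 3 x 10 x 2 shape is taken from the text.

```python
# Hypothetical labels for illustration; only the 3 x 10 x 2 shape comes from the text.
APTITUDE_LEVELS = ("high", "middle", "low")
TREATMENTS = ("mastery", "non-mastery")
CLASSES_PER_TREATMENT = 10


def design_cells():
    """Return every (aptitude, treatment, class) cell of the nested layout.

    Classes are nested within treatments: class 1 under "mastery" is a
    different classroom from class 1 under "non-mastery".
    """
    cells = []
    for treatment in TREATMENTS:
        for class_number in range(1, CLASSES_PER_TREATMENT + 1):
            for aptitude in APTITUDE_LEVELS:
                cells.append((aptitude, treatment, class_number))
    return cells


cells = design_cells()
# A classroom is identified by its treatment together with its class number.
classrooms = {(treatment, class_number) for _, treatment, class_number in cells}
print(len(cells), "cells across", len(classrooms), "classrooms")
```

Because a classroom is identified by treatment and class number together, the enumeration keeps the nesting explicit instead of treating class as a factor crossed with treatment.
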
Rationale for the Design 

This design was used in order to counter the main disadvantage
of completely randomized designs: their relative inefficiency. The
error term, against which the variability of treatment means is tested,
is generally large in randomized designs. This large error term results
from the variability among subjects within groups. Much of the error
variance arises from individual differences in factors which affect
performance. The blocking design is one method of removing some of the
error variance due to individual differences (Myers, 1966).
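
The effect of blocking on the error term can be sketched with a small simulation. Everything in it is an assumption made for illustration: the block means, the within-block spread, and the sample size are invented values, not data from this study.

```python
import random
import statistics


def simulate_error_variance(seed=0, n_per_block=200):
    """Compare error variance with and without blocking on aptitude.

    All numbers are hypothetical. Scores are drawn around a different
    mean for each aptitude block, with the same within-block spread.
    """
    rng = random.Random(seed)
    block_means = {"low": 10.0, "middle": 19.0, "high": 29.0}
    within_block_sd = 3.0

    scores = {
        name: [rng.gauss(mean, within_block_sd) for _ in range(n_per_block)]
        for name, mean in block_means.items()
    }

    # Without blocking, aptitude differences are left in the error term.
    pooled = [score for block in scores.values() for score in block]
    unblocked_variance = statistics.pvariance(pooled)

    # With blocking, only the variation inside each block counts as error.
    blocked_variance = statistics.mean(
        [statistics.pvariance(block) for block in scores.values()]
    )
    return unblocked_variance, blocked_variance


unblocked, blocked = simulate_error_variance()
print(f"unblocked error variance: {unblocked:.1f}")
print(f"blocked error variance:   {blocked:.1f}")
```

In this sketch the unblocked variance includes the spread between aptitude groups, which is the component that blocking removes from the error term.
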

Blocking has four advantages. First, the treatment groups are 
roughly matched on a measure which should affect performance. Second, 
the interaction effects can be studied. Third, the blocking design
will usually be more efficient than a one factor design involving the
same total number of dependent measures at each treatment level (Myers,
1966). Fourth, the blocking design allowed the researcher to observe the
efficiency of the mastery procedures with students of varying aptitude,
particularly low aptitude students. However, it was not a practical
possibility to rearrange student seating into aptitude levels in each
of the classes to minimize across aptitude level interaction. Students
within each class sat at their normal work desks. No attempt was made
by the researcher to control for across-aisle or within-aisle student
communication even though the self-instruction materials were designed
to minimize student interaction.

This design also involved the use of two posttest treatment
groups. There were several reasons why a posttest-only, rather than
a pretest-posttest, design was used. As Campbell and Stanley (1963) have
pointed out, the pretest of initial differences is not essential in
experimental designs. The randomization of students to the two
treatment groups controlled for initial systematic biases. Since
randomization controlled for systematic initial biases, it was assumed that the
achievement scores of the two treatment groups would have exhibited
only chance differences from each other on a pretest.

A cognitive pretest was also rejected. Campbell and Stanley
(1963) have indicated that a pretest of new subject matter is
inappropriate. Greene (1965), Thomas (1967), and Walsh (1957) found that
pretest scores of students did not differ significantly from chance. These
findings suggested that pupil scores on a pretest in the present study
probably would not have differed significantly from chance.

According to Campbell and Stanley (1963), the Posttest-Only Design 
is preferred to the Pretest-Posttest Design because it controls for 
the effects of the pretests. Pretesting may have been a confounding 
variable in the proposed study. The Posttest-Only Design also required
only two treatment groups, thereby resulting in larger sample sizes
than would have been possible if other research designs had been selected
which required more than two treatment groups, such as the Solomon Four
Group Design.

As a practical matter, moreover, teachers and students react
negatively to the administration of pretests on subject matter in
which they have had no systematic instruction. Informal observations
in studies using pretests (Greene, 1965; Thomas, 1967; and Walsh,
1967) indicate that the administration of a pretest can lead to a
hostile attitude on the part of students to an experimental study.
Since the population selected as mastery and non-mastery subjects was
not accustomed to using self-instructional materials over long periods
of learning, procedural treatment prudence as well as design
considerations supported the desirability of a posttest-only design.
Rationale for the Concomitant Variable 

In the conduct of experimental research, standardized measures
may be used for a variety of purposes: to predict pupil achievement, to
match the sample to the reading level of material, to describe pupil cognitive
variables, and to establish concurrent validity of the instrument
developed by the investigator. Since 1965, a continuing concern of the
Anthropology Curriculum Project and the Geography Curriculum Project has
been to develop materials for pupils in terms of characteristics
related to school achievement. Reading ability has been consistently
identified as the most significant ability related to success in school.
However, the typical full scale reading battery, such as
in the Iowa Tests, takes more than one class period to administer, a
practical matter which interferes with collection of data. The word
meaning, or vocabulary, sections of most reading or achievement tests
can easily be administered within the time constraints of one period
within a classroom. Because of the high correlation of vocabulary with
reading, the Anthropology and Geography Curriculum Projects have
therefore used knowledge of word meaning, as measured by a vocabulary test,
as an efficient way to collect data for the concomitant variable.

The concomitant variable selected for this study was knowledge of
word meaning, as measured by the vocabulary section of the Iowa Tests
of Basic Skills: Forms 5 and 6 (Lindquist and Hieronymus, 1971).
Administration time is 17 minutes.

Knowledge of word meaning was selected as the concomitant variable
because this category correlates highly both with the ability to read
and with achievement in school subjects. Russell (1961) writes that many well
known standardized reading tests, including the Iowa Tests of Basic
Skills, contain tests of vocabulary meaning. A child's understanding
and interpretation of sentences and paragraphs will depend considerably
upon his knowledge of individual words in the larger units. The Iowa
vocabulary test was used as the concomitant variable in the present
study. Knowledge of word meaning correlates more highly with reading
comprehension than any other sub-test of the Iowa battery (Technical
Manual, 1956, 1964, 1971).

The reading test of the Iowa battery is a reading comprehension
test (Morgan, 1959). Knowledge of word meaning is essential to the
ability to read, and is widely used in the testing of students to
predict subsequent success in school (Seegers, 1939; Spache, 1943; Traxler,
1945).

Knowledge of word meaning is also the subtest on the Binet and
Wechsler that consistently shows the highest correlation with the total
score (Thorndike and Hagen, 1969), and it is the first sub-test on the Binet,
where it is used to establish difficulty of testing level. This high
correlation of success in school with verbal ability has stimulated the
development of picture-vocabulary tests as abbreviated intelligence
test devices. Examples of these picture-vocabulary tests are the
Full-Range Vocabulary Test (Ammons and Ammons, 1948) and the Peabody Picture
Vocabulary Test (Dunn, 1959).

The word meaning section of the Iowa Test was also chosen for its high
test reliability and its use in Georgia statewide testing. According to
the 1974 Technical Manual, the grade seven vocabulary test obtained a
test reliability of .89, while the reading test obtained a reliability
of .92. The intercorrelation between the vocabulary and reading tests
was .81. The standard error of measurement on the raw scores for the
vocabulary test was 3.0. With the large sample used in obtaining the
reliability data, this is a strong indication that the vocabulary test
was a highly reliable measure of vocabulary level.
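
The link between a reliability coefficient and a standard error of measurement can be sketched with the classical formula SEM = SD x sqrt(1 - r). This is an assumption: the source does not say how the reported standard error was computed, and the raw-score standard deviation below is back-calculated for illustration, not a reported figure.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def implied_sd(sem, reliability):
    """Invert the formula to get the raw-score SD implied by an SEM."""
    return sem / math.sqrt(1.0 - reliability)


# Reported values for the grade seven vocabulary test: reliability .89, SEM 3.0.
# The SD is NOT reported in the text; it is derived here under the assumption
# that the classical formula applies.
sd = implied_sd(3.0, 0.89)
print(f"implied raw-score SD: {sd:.2f}")
print(f"SEM recovered from it: {standard_error_of_measurement(sd, 0.89):.2f}")
```

Under this assumption, a fixed standard deviation gives a smaller standard error as reliability rises, which is why the two figures are usually reported together.
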

The Iowa Test battery is used in the Georgia state-wide testing
program at Grades 4, 8, and 12. Consequently, the use of the Iowa Test
is readily accepted in curriculum research by Georgia teachers and
administrators.

Unit of Statistical Analysis 

The researcher had three choices for the unit of statistical
analysis: the individual, the classroom, or the aptitude group.
The individual was not the focus
in this study even though the treatment materials were self-instructional.
The classroom should have been the unit of statistical analysis; however,
this would not have allowed an analysis of the relationship between the
three aptitude levels and treatments. Therefore, the aptitude group
was used as the unit of statistical analysis. The procedure for
randomization and cell assignment is described in the section "Random
Assignment."

Experimental Study
This study compared self-instructional mastery and non-mastery
treatments to determine if there were differences in achievement and
time of high, middle, and low aptitude students on learning, retention,
and times-to-testing.
Sample Selection 

Dr. Marion J. Rice, Director of the Georgia Geography Curriculum 
Project, made arrangements with officials of the Savannah-Chatham 
County Public Schools in Georgia to obtain 20 Grade Seven classes (539 
students) in four schools for the experimental study. 
Random Assignment of Individuals to Treatment Groups

Table 4.2

Number of Students by Reading Level by Class and Treatment,
Including Those Students Omitted from Data Analysis

[Table body, with columns for Classroom and Treatment, is not recoverable from the source.]

The bracket ( ) indicates the number of students dropped from the study.
The square □ indicates the number of students not used for data
analysis.

There were five steps in the randomization process. First, all
students were administered the word meaning section of the Iowa Tests
of Basic Skills: Forms 5 and 6 (Lindquist and Hieronymus, 1971). Student
scores were rank ordered and a mean and standard deviation were
computed for the group. Second, an a priori decision was made to select
reading aptitude groups within classes based upon the mean of student
scores, creating cells of unequal Ns. The mean of the group was 19.03,
and a quarter of a standard deviation on either side of the mean formed
the middle reading aptitude group. The gaps between one quarter and
one half standard deviation above and below the mean were used as clear
differentials between the three levels of reading aptitude. Students
falling within these gaps participated in the study but were
excluded from the data analysis. The high reading aptitude group was
comprised of students whose scores were greater than one half standard
deviation above the mean. The low reading aptitude group was comprised
of students whose scores were more than one half standard deviation
below the mean. Third, students were then sorted back into their
classes, maintaining their respective aptitude grouping. Fourth,
classes were then randomly assigned to one of two groups. Fifth,
treatment was then randomly assigned to the groups.
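
The grouping and assignment steps above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names are hypothetical, the standard deviation in the usage example is an invented value (only the mean of 19.03 is reported), and the treatment of scores falling exactly on a cutoff is an assumption.

```python
import random


def classify(score, mean, sd):
    """Place a word-meaning score into an aptitude group.

    Cutoffs follow the text: within a quarter SD of the mean is "middle",
    more than half an SD above is "high", more than half an SD below is
    "low". Scores in the gaps are "excluded" from analysis. How scores
    exactly on a cutoff were handled is not stated; this is an assumption.
    """
    z = (score - mean) / sd
    if abs(z) <= 0.25:
        return "middle"
    if z > 0.5:
        return "high"
    if z < -0.5:
        return "low"
    return "excluded"


def assign_classes(class_ids, seed=None):
    """Randomly split classes into two groups, then randomly assign
    a treatment to each group (steps four and five)."""
    rng = random.Random(seed)
    shuffled = list(class_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    groups = [shuffled[:half], shuffled[half:]]
    treatments = ["mastery", "non-mastery"]
    rng.shuffle(treatments)
    return {treatments[0]: sorted(groups[0]), treatments[1]: sorted(groups[1])}


# Usage with the reported mean and a hypothetical SD of 8.0.
for score in (14, 19, 22, 24):
    print(score, classify(score, mean=19.03, sd=8.0))
print(assign_classes(range(1, 21), seed=1974))
```

Keeping the gap scores as an explicit "excluded" category mirrors the text, where those students received the treatment but were left out of the data analysis.
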
Distribution of Students by Treatment and Aptitude

Twenty grade seven classes (539 students) were selected for this
study. Students within classes were distributed as displayed in Table
4.2. Not all 539 students were used in the study. There were two basic
reasons why some students were not used. First, when the treatment by
levels design was set up on the concomitant variable, some student scores on
the word meaning test fell into the gaps between aptitude levels.
This occurred because this study required a clear differentiation between
the high, middle, and low aptitude groups. This resulted in 19 per
cent of the population being omitted from the data analysis.

Second, students were deliberately omitted from the analysis for 
the following reasons: 

1. Consistent and prolonged absences from school for more than
ten of the twenty instructional days. Absences due to sickness or
suspension were the only explanations excepted in this category.

2. Students had moved away from the school and either did not
complete the unit materials or could not complete the final tests.

This resulted in 7 per cent of the population being omitted from
the data analysis, of which 26 per cent came from the high aptitude group,
10 per cent came from the middle aptitude group, and 64 per cent came
from the low aptitude group.

Reading Scores, Grade Equivalents, and National Percentile Rank 

Students distributed by aptitude had the following
characteristics. High aptitude students were reading equivalent to grade
level. Middle aptitude students were approximately two grade levels
lower, while low aptitude students were four grade levels below actual
grade level (see Tables 4.3 and 4.4 for aptitude and grade equivalent
levels). The grade equivalent scores translated to national percentile
ranks indicate that the high aptitude group fell in the 58th percentile
rank, the middle aptitude group fell in the 25th percentile rank, and
the low aptitude group fell in the 3rd percentile rank (see Table 4.5).
These scores indicate that most students used in this study were below
the national norm for reading as measured by the word meaning section,
Iowa Tests of Basic Skills.

Table 4.4

Mean Reading Scores, Standard Deviations,
and Grade Equivalents for Aptitude Groups

Aptitude     Mean Scores     S. D.     Grade Equivalents

High            29.4         5.26            8.1
Middle          19.1         2.32            6.3
Low             10.7         2.83            3.9




Table 4.5

Mean Reading Grade Equivalents and
National Percentile Ranks for Aptitude Groups

Aptitude     Grade Equivalent     Percentile Rank

High               8.1                  58
Middle             6.3                  25
Low                3.9                   3



Orientation of Teachers 

The researcher supplied each teacher and principal of the four
cooperating junior high schools with copies of the text and workbook
Functions of Cities and written instructions regarding procedures.
Because the teachers were not required to teach students the treatment
unit, no attempt was made to train the teachers in any aspect of the
treatment material. However, because all learning material was included
in the text and workbook, teachers were expected to keep abreast of the
content, including the Thought Questions. Thought Questions could be
used as points for discussion on the mid-week break.
Duration of the Study 

The study was conducted over a 20-day instructional period from 
April 4th to May 7th, 1974. During this period both treatment groups 
studied Functions of Cities. At the end of the 20-day instructional
period a geography achievement posttest was given to both treatment 
groups. A delayed posttest of geography achievement was administered 
on May 24th, 1974, 17 days after the conclusion of treatment to measure 
retention. 

Pattern of Logic Used in the Study 
A 3 x 10 x 2, aptitude by classes-nested-within-treatments, by
treatments, multivariate analysis of variance was used with learning,
retention, and times-to-testing as the effects measures. Factors
included two treatments and three levels of aptitude. This experimental
design was depicted earlier on page 69.
Research Hypotheses

The major purpose of this study was to compare self-instructional
mastery and non-mastery treatments to determine if there were differences
in achievement and time of high, middle, and low aptitude students.
The main hypotheses investigated were:

1. The mastery and non-mastery treatments will produce differences
in the average effects which are not the same (p<.05) at the high,
middle, and low aptitude levels measured by geography posttests of:

(a) learning,

(b) retention,
and a measure of

(c) times-to-testing.

2. With pupils pooled across the three levels of aptitude, the
mastery and non-mastery treatments will produce
differences (p<.05) in the average achievement measured by geography
posttests of:

(a) learning,

(b) retention,
and a measure of

(c) times-to-testing.

3. With pupils pooled across the two treatments, there are
differences among the three levels-of-aptitude vectors of average effects
(p<.05) measured by geography posttests of:

(a) learning,

(b) retention,
and a measure of

(c) times-to-testing.

Pattern of Logic for Testing the Research Hypothesis Statement

Statement                               Logic Pattern        Source

If the research hypothesis is true,     If A, then B         Assumption
then the observed differences of
average effects will not be the
same across the three levels of
aptitude.

For these differences in average        B without A is       Assumption
effects to be found different           extremely
across the three levels of aptitude     unlikely
in the context of the research
hypothesis being false is very
unlikely.

The differences were found not to
be the same across the three levels
of aptitude.

                                        A is much more       Polya
                                        credible             Pattern IV


All research hypotheses followed the same pattern of logic.
Discussion of Pattern of Logic

The pattern of logic used as a base for the proposed study claims
that it is extremely unlikely for differences in the average effects
to be found different across the three levels of aptitude without the
hypothesis being true. This claim can be considered to be probable
only if the personal attributes of the subjects and contextual
attributes other than treatment are eliminated as possible causes for the
differences.

In the proposed study, personal attributes of the subjects can
be eliminated as a probable cause of a Type I error
(p<.05). This is true because of the randomization factor in the
research design. The personal attributes of the subjects other than
reading aptitude are randomly distributed along with the assignment of
individuals to treatment groups. While randomization does not ensure
that the two groups are perfectly matched on all variables which might
influence the results of the experiment, it does guard against the
danger of systematic biases in the data (Myers, 1966).

The research design does not take into account contextual or
situational variables that might cause a difference between group means.
The researcher dealt with these variables in two ways. Whenever possible,
direct control of the variables was exercised over such influences
as treatment materials, directions to teachers, and test administration.
Where direct control was impractical, variables, e.g., school
organization, teacher experience, physical plant, or class size, were
observed and described systematically.

The direct control of certain contextual variables with the two
treatments makes it highly unlikely that those variables caused
differences between the means of each treatment group in the study. It
was also assumed that variables that were observed and described rather
than controlled did not cause a difference in the means of the two
treatment groups if the variables did not differ greatly between groups.
Within the limits described above, it is logical to claim that any
differences in means can probably be attributed to treatment differences,
thereby making the assumption more credible.

In the event that the average effects are the same across aptitude 
levels the claim can still be considered probable due to the control 
exercised over the subjects and contextual attributes other than treat- 
ment which may have accounted for differences in the average effects. 

Due to the limitation of experimenting with existing classes which
functioned within the framework of the school and the school system, 
there were some contextual variables that could not be controlled by the 
researcher. The contextual variables are described in the following 
section. 

Contextual Variables
The contextual variables which could not be controlled included
the effects of the community, school district, school, and the teachers.
Community and School District

The study was conducted in the Savannah-Chatham County Public
Schools. The population of Chatham County is approximately 209,000.
The economic base of the city and county is the harbor and docks, with
the military and manufacturing as other important activities.

The student enrollment in the Savannah-Chatham County Public
Schools was 33,606 as of February 28, 1974. This total systemwide
enrollment was composed of 19,292 elementary grade students, 13,353
secondary grade students, 668 elementary special education students,
and 143 secondary special education students. There are 17 secondary
and 42 elementary schools (B. Hirshberg, personal communication, April
2, 1974).

The school system is under court order to maintain racial balance
of faculties and students in every school. This racial balance was
achieved by pairing schools with predominantly black and white student
bodies. Bussing was used to facilitate this equality of racial
composition. During the time that this study was conducted, principals and
teachers indicated no incidents of racial tension among the students.
Characteristics of the Schools in the Study

The twenty classes that participated in this study were located
in four schools in the Savannah-Chatham County School District. These
schools had the following characteristics.

School A. The original construction of the school was completed
in 1963. No additions have been contemplated since 1963. There were
24 regular classroom teachers, one special remedial teacher, and one
librarian at the school. The school was administered by an appointed
principal.

Classes at all grade levels (7-9) were heterogeneously grouped.
The racial composition of the school was 54 per cent black and 46 per
cent white. Socio-economically, the geographic area around the school
was below average and the area was under Title I funding. The principal
reported that racial tension was not a problem in the school.

School B. The school was constructed in 1960. There were 31 
regular classroom teachers and one special teacher. The school was 
administered by an appointed principal.

The classes were self-contained; however, they were purported to 
be homogeneous. The word 'homogeneous' was used in the sense that a 
racial balance was maintained in each class. The racial composition of 
the school was 50 per cent black and 50 per cent white. Approximately 
50 per cent of the school population came from the middle and lower 
middle class areas around the school, while the other 50 per cent were 
bussed from economically deprived areas. The principal reported that 
racial tension was not a problem in the school. 

School C. The original construction of the school was completed
in 1959. The junior high school is adjacent to but integrated with the
senior high school next door. There were 28 regular classroom teachers
at the junior high school but there were no special teachers. The
school was administered by an appointed principal.

The classes were self-contained and heterogeneous. The racial
composition of the school was 58 per cent black and 42 per cent white,
with students coming from the lower, upper lower, and middle
socio-economic areas. The principal indicated that the school appeared free
from racial tension.

School D. Construction of the school was completed in 1962. There
were 46 regular classroom teachers, 2 special teachers, and a librarian.
The school was administered by an appointed principal.

Classes were self-contained and heterogeneous. The racial
composition of the school was 45 per cent black and 55 per cent white.
Students came from lower and lower-middle socio-economic areas. The
principal did not indicate that racial differences had created any
problems.

Characteristics of the Teachers in the Study

Five grade seven teachers from the Savannah-Chatham County School
District participated in this study. The researcher spent nine days in
Savannah while the study was in progress and during this time
considerable observation of classroom and material management was made. The
following analysis arose from written teacher responses to a questionnaire
and researcher observations.

Teacher A. This teacher was the eldest of the group, female, and had 
taught for 25 years. 

This teacher held a Bachelor of Science degree with a major in 
social studies. She reported that she had taken nine courses in geography 
and had attended some geography workshops.

Teacher B. This teacher was in the mid-twenties, male, and was teaching
for the first time.

He held a Bachelor of Science degree with a major in Physical
Education and a minor in social studies. He had completed one course
in geography.

Teacher C. This teacher was in the mid-twenties, female, and had four 
years teaching experience. 

She held a Bachelor of Science (Education) degree with a major in
Social Science Education in geography. She had completed 55 quarter
hours in geography.

Teacher D. This teacher was in the mid-twenties, female, and had four
years teaching experience.

She held a Bachelor of Science degree in Education with a major in
social studies. She had completed one course in geography.

Teacher E. This teacher was in the mid-twenties, male, and had three 
years teaching experience. 

He held a Bachelor of Science degree in Education with a major in 
social science. He reported that he had completed 10 hours in geography. 
Summary of Contextual Variables

The four schools that participated in the study were similar in
organization, administration, plant facilities, and student populations.
All four schools and all 20 classes were racially integrated.
Classrooms were self-contained. However, each teacher taught more than one
class. Table 4.6 indicates the number of classes in this study taught
by each teacher.

Table 4.6

Teachers and the Number of Classes Taught

Teacher     Classes Taught

A                 2
B                 5
C                 5
D                 4
E                 4

Total            20


The observed differences between the two treatment groups
regarding the personal attributes of the teachers were deemed to be minor
because all but one teacher taught classes in both treatments.
Therefore, the researcher concluded that there were no contextual variables,
other than treatment, that accounted for observed differences between
the two treatments on the posttests.

Statistical Procedures

A 3 x 10 x 2, aptitude by classes-nested-within-treatments, by
treatments, multivariate analysis of variance (MANOVA) was used with
the learning, retention, and times-to-testing mean scores as the effects
measures. This experimental design was used to determine if the
mastery and non-mastery treatments produced
differences (p<.05) in the average effects which were not the same at the
high, middle, and low aptitude levels. The computer program used in
the above analysis was the BMD 12V (Biomedical Computer Programs,
1973). This program can perform multivariate and univariate analyses
of variance for any hierarchical design with cells that contain
equal Ns, including nested, partially nested and partially crossed,
and fully crossed designs. While a Multivariate, Univariate, Discriminate
Analysis of Independent Data (MUDAID) program had been considered
as the program for analysis of the data in this study, it was found that
it could not handle designs that included a nested factor. Consequently,
the BMD 12V program was used because the independent variable, class,
was nested within treatments.
Statement of the Statistical Hypotheses

The purpose of this study was to compare self-instructional
mastery and non-mastery treatments to determine if there were
differences in achievement and time of high, middle, and low aptitude
students, using measures of learning, retention, and times-to-testing.

To accomplish this purpose, the following statistical hypotheses
were tested at the .05 level of significance. The subscript order is
the same as that used on the experimental layout, Table 4.1, p. 69.
Hypotheses for MANOVA 
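1. Interaction: Treatment by Aptitude

\[
H_0:\;
\begin{bmatrix} \mu_{.111} \\ \mu_{.112} \\ \mu_{.113} \end{bmatrix}
-
\begin{bmatrix} \mu_{.121} \\ \mu_{.122} \\ \mu_{.123} \end{bmatrix}
=
\begin{bmatrix} \mu_{.211} \\ \mu_{.212} \\ \mu_{.213} \end{bmatrix}
-
\begin{bmatrix} \mu_{.221} \\ \mu_{.222} \\ \mu_{.223} \end{bmatrix}
=
\begin{bmatrix} \mu_{.311} \\ \mu_{.312} \\ \mu_{.313} \end{bmatrix}
-
\begin{bmatrix} \mu_{.321} \\ \mu_{.322} \\ \mu_{.323} \end{bmatrix}
\]

The three differences are, in order, the vectors of the high aptitude
group by treatment, the middle aptitude group by treatment, and the low
aptitude group by treatment.
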
This null hypothesis states that the average difference between
treatment effect vectors is the same at each aptitude level. This
null hypothesis was tested against the two-tailed alternative hypothesis
that the average difference between treatment effect vectors is not the
same at each aptitude level.
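2. Main Effects: Treatments

\[
H_0:\;
\begin{bmatrix} \mu_{..11} \\ \mu_{..12} \\ \mu_{..13} \end{bmatrix}
=
\begin{bmatrix} \mu_{..21} \\ \mu_{..22} \\ \mu_{..23} \end{bmatrix}
\]
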
This null hypothesis states that with pupils pooled across the
three levels of aptitude, there is no difference between the mastery
and non-mastery treatment vectors of average effects. This null
hypothesis was tested against the two-tailed alternative hypothesis
that there is a difference.
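3. Main Effects: Aptitude

\[
H_0:\;
\begin{bmatrix} \mu_{.1.1} \\ \mu_{.1.2} \\ \mu_{.1.3} \end{bmatrix}
=
\begin{bmatrix} \mu_{.2.1} \\ \mu_{.2.2} \\ \mu_{.2.3} \end{bmatrix}
=
\begin{bmatrix} \mu_{.3.1} \\ \mu_{.3.2} \\ \mu_{.3.3} \end{bmatrix}
\]
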
This null hypothesis states that with pupils pooled across the
two treatments, the vectors of achievement are the same at each of the
three aptitude levels. This null hypothesis was tested against the
two-tailed alternative hypothesis that with pupils pooled across the
two treatments, the vectors of achievement are not the same at each of
the three aptitude levels.

If statistical significance was found on the multivariate
interaction hypothesis, an a priori decision was made to follow up by
testing the univariate interaction hypotheses. In that case no main
effects were to be tested in either the multivariate or the univariate
analyses. If there was no statistical significance on the multivariate
interaction hypothesis, each of the multivariate main effects was to be
tested. If these were statistically significant, then the decision was
to follow up by testing each of the effects measures at the univariate
level. If there were no statistically significant multivariate main
effects, then no follow-up tests were planned (Hummel and Sligo, 1971).
Duncan's Multiple Range Test was the appropriate post hoc test for
statistically significant outcomes for the univariate analyses
(Edwards, 1968), while the Bonferroni t statistic was the appropriate
post hoc test for simple effects (Marascuilo and Levin, 1970).
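
The a priori decision rules described above can be read as a small
branching procedure. The following is a minimal sketch in Python, not
part of the original analysis; the function name, argument names, and
the p values in the usage lines are hypothetical and serve only to
illustrate the order in which the tests were planned.

```python
def planned_follow_ups(p_interaction, p_main_effects, alpha=0.05):
    """Return the follow-up tests implied by the a priori decision rules.

    p_interaction  -- p value of the multivariate treatment-by-aptitude test
    p_main_effects -- dict mapping a main effect name to its multivariate p
    alpha          -- significance level (the study used .05)
    """
    if p_interaction < alpha:
        # Significant multivariate interaction: go straight to the
        # univariate interaction hypotheses; main effects are not tested.
        return ["univariate interaction"]
    # No interaction: each multivariate main effect is examined, and only
    # the significant ones are followed up measure by measure.
    return [
        f"univariate {effect}"
        for effect, p in p_main_effects.items()
        if p < alpha
    ]


# Hypothetical p values, chosen only to exercise both branches.
print(planned_follow_ups(0.01, {"treatment": 0.001, "aptitude": 0.001}))
print(planned_follow_ups(0.40, {"treatment": 0.001, "aptitude": 0.001}))
```

Post hoc tests (Duncan's Multiple Range Test and the Bonferroni t) sit
one level below this procedure and are not modelled in the sketch.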

4. Interaction: Treatment by Aptitude (Posttest)

\[
H_0:\; \mu_{.111} - \mu_{.121} = \mu_{.211} - \mu_{.221}
     = \mu_{.311} - \mu_{.321}
\]

This null hypothesis states that the difference in average effects
of the two treatments is the same at each aptitude level. This null
hypothesis was tested against the two-tailed alternative that the
difference in average effects of the two treatments is not the same at
each aptitude level.

5. Main Effects: Treatments (Posttest)

\[
H_0:\; \mu_{..11} = \mu_{..21}
\]

This null hypothesis states that with pupils pooled across aptitude
levels there are no statistical differences between treatments
on the mean posttest scores. This null hypothesis was tested against
the two-tailed alternative hypothesis that with pupils pooled across
aptitude levels there are statistical differences between treatments
on the mean posttest scores.

6. Main Effects: Aptitude (Posttest)

\[
H_0:\; \mu_{.1.1} = \mu_{.2.1} = \mu_{.3.1}
\]

This null hypothesis states that with pupils pooled across the
two treatments there are no statistical differences between aptitude
groups on the mean posttest scores. This null hypothesis was tested
against the two-tailed alternative that with pupils pooled across the
two treatments there are statistical differences between aptitude
groups on the mean posttest scores.

The hypotheses for the analysis of variance for each of the effects
measures followed the same format. Therefore, it was not necessary to
state each set of hypotheses because of the repetition involved. The
same hypotheses were applied to the measures of retention and
times-to-testing.

Hypotheses for Simple Effects

13, 14, 15. Simple Effects: Learning

\[
H_0:\;
\begin{aligned}
\mu_{.111} &= \mu_{.121} \\
\mu_{.211} &= \mu_{.221} \\
\mu_{.311} &= \mu_{.321}
\end{aligned}
\]

These null hypotheses state that at each of the three aptitude
levels there is no difference between the two treatment means on the
posttest. Each hypothesis was tested against its alternative that
there are differences between treatment means across each level of
aptitude on the posttest.

The hypotheses for simple effects for each of the effects measures
followed the same format. Therefore, it was not necessary to state
each set of hypotheses because of the repetition involved. The same
hypotheses were applied to the measures of retention and
times-to-testing.

Significance Level

In the present study the .05 significance level was used in
testing the null hypotheses. This meant that a difference as large as
or larger than the obtained one could occur by chance as infrequently
as 5 times out of 100. Therefore, the probability of rejecting a true
statistical hypothesis (Type I or α error) is .05.

A Type II error (β) is the failure to reject a false statistical
hypothesis. The relationship between α (Type I error) and β (Type II
error) is inverse. Decreasing the probability of a Type I error
increases the probability of a Type II error. The selection of a
significance level, therefore, reflects a compromise between the
relative importance of the two types of errors (Myers, 1966).

The power of a statistical test is defined as 1 - β, or the
probability of rejecting a statistical hypothesis when it is false and
should be rejected. If α (Type I error) is held constant, the power of
the significance test can be increased by increasing the number of
observations in the sample; therefore, if power is increased, then the
probability of β (Type II error) is decreased (Edwards, 1968).

By selecting a significance level of .05 instead of one that is
higher (e.g., .01), the probability of making a Type II error is
reduced. The .05 level, however, is strong enough to warrant concluding
that the difference is not attributable merely to sampling errors. In
cases where small sample sizes are used, Walker and Lev (1958) have
stated that the level of significance should not be high because both
factors reduce the power of a test. However, in the case of the present
study where the sample size was larger (n=60), a .05 level of
significance was considered appropriate.
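
The trade-off among α, β, and sample size described above can be
illustrated numerically. The sketch below is not taken from the study:
it assumes a simple two-tailed z test on a mean rather than the F tests
actually used, and the effect size of 0.3 standard deviations is a
hypothetical value chosen only for illustration.

```python
from statistics import NormalDist


def two_tailed_power(effect_size, n, alpha):
    """Approximate power (1 - beta) of a two-tailed z test on a mean.

    effect_size -- true difference in standard deviation units (hypothetical)
    n           -- number of observations
    alpha       -- Type I error rate
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5
    # Probability that the statistic falls in either rejection region.
    return z.cdf(-critical + shift) + z.cdf(-critical - shift)


for alpha in (0.05, 0.01):
    for n in (20, 60):
        power = two_tailed_power(0.3, n, alpha)
        print(f"alpha={alpha:.2f} n={n:2d} power={power:.3f} beta={1 - power:.3f}")
```

Under these assumptions, power is larger at the .05 level than at the
.01 level for a fixed n, and larger at n = 60 than at n = 20 for a
fixed α, which is the pattern the paragraph above relies on.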
Assumptions Underlying the Multivariate Analysis of Variance (MANOVA)

The MANOVA was used as one method of data analysis in this study
because it was appropriate for testing the significance of differences
between treatment means in terms of three dependent variables considered
simultaneously (Tatsuoka, 1971). In order for MANOVA to be an
appropriate test of the statistical hypotheses, the data must have met
certain assumptions:

1. The variables under study must follow a multivariate normal
distribution.

2. There must be equal dispersion matrices.

Assumptions Underlying the Analysis of Variance (ANOVA)

The ANOVA was used as the second method of data analysis in this
study because it permitted a straightforward analysis of the hypothesis
under consideration (Myers, 1966). In order for ANOVA to be an
appropriate test of the statistical hypothesis, the data must meet the
following assumptions:

1. The deviations, due to uncontrolled variability, of the
individual mean scores from the treatment group population mean are
independently distributed;

2. the deviations, due to uncontrolled variability, of the
individual mean scores from the treatment group population mean are
normally distributed;

3. the variance of the deviations, due to uncontrolled
variability, is the same for all treatment group populations;

4. the null hypothesis is true (Myers, 1966).

If the first three assumptions are valid, then a significant F
may be attributed to the falsity of the fourth assumption (Myers, 1966).

To meet the assumptions underlying the F test, the following
procedures were used:

1. The validity of the independence assumption was met by the
random assignment of classes to two groups and then random assignment
of treatment to groups.

2. The validity of the normality assumption depended on the
measure chosen but was of no concern since Norton (1953) had shown
that the F ratio is little influenced by departures from normality
(Myers, 1966).

3. The validity of the homogeneity of variance assumption was
tested by using Hartley's test, and the data met this requirement for
the achievement measure.
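
Hartley's test compares the largest group variance with the smallest.
The sketch below shows only the statistic itself; the scores are
hypothetical and are not data from the study, and the comparison with a
tabled critical value (which depends on the number of groups and the
degrees of freedom per group) is left out.

```python
from statistics import variance


def hartley_f_max(groups):
    """Return Hartley's F-max: largest sample variance over the smallest."""
    variances = [variance(g) for g in groups]
    return max(variances) / min(variances)


# Hypothetical scores for three equal-sized groups (not data from the study).
groups = [
    [31, 35, 38, 40, 36],
    [24, 27, 30, 26, 28],
    [15, 18, 14, 19, 17],
]
f_max = hartley_f_max(groups)
print(f"F-max = {f_max:.3f}")
```

An F-max that does not exceed the tabled critical value is taken as
consistent with homogeneity of variance; the test assumes equal group
sizes, which matches the equal-N cells of the present design.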

Limitations

The present study was limited to an investigation of the
comparative effects of the self-instructional unit Functions of Cities
in a mastery learning and non-mastery learning mode using three
measures: learning, retention, and times-to-testing. It was further
limited to the effects of materials which met the criteria specified by
the researcher in Chapter 3.

A second limitation of the study was that while the researcher
spent, on average, two days a week observing and assisting in the
schools there could be no check made to ensure that written and oral
directions were being carried out by the teachers and students in the
study. Oral and written directions were provided prior to the beginning
of treatment to each teacher along with a sample copy and classroom
set of the materials in the format to be followed. Classrooms were
visited regularly each week and teachers reported no irregularities.
These procedures strengthened the assumption that the teachers and
students followed the instructions outlined, but the degree to which
individuals may have deviated from the established procedures cannot
be determined.

A third limitation of the study was the use of an available pool
of 539 seventh grade students in 20 classes in the Savannah-Chatham
County School District. This population was not representative of a
national sample. The subjects were below the national average in
reading word knowledge as measured by the Iowa Tests of Basic Skills.
In addition, the available pool of students did not follow the national
ratio with regard to racial composition. In this study, approximately
48 per cent of the students were black. This percentage is considerably
higher than the national percentage of 12.2 for metropolitan areas
(U. S. Bureau of Statistics, 1971).

A fourth limitation of the study was the necessity to use the
same form of the measuring instrument to measure retention as was used
to measure initial achievement. However, two aspects of the present
study militated against a carry-over effect from the posttest. First,
there was a planned 17 day interval between the administration of the
posttest and the delayed posttest with no student feedback during this
period. Second, students were not informed that they were to be

A fifth limitation of the study concerned the lack of a pilot
testing phase in the development of the materials and measuring
instruments. This was due to the press of time in getting the materials
into the schools for actual administration. However, a number of
control steps were conducted that the researcher hoped would help
offset the disadvantages of no pilot phase. The lack of a pilot test
phase is acknowledged as another limitation.

A sixth limitation of the study was the use made of the mid-week
discussion class. The activities that teachers and students engaged
in were wide ranging. Originally, it was proposed that content-oriented
activities would be used in the classroom. However, the researcher
observed that slides, filmstrips, and films were utilized along with
library activities. Many activities did not have specific bearing on
the unit content. At other times students were permitted to work on
their workbooks. This provided more time for students to work with the
materials. Therefore, some students had more time to learn the content.
A seventh limitation of the study was the possibility that students
did not respond independently. In order to maintain normal classroom
learning conditions students were not reassigned to different desks
in association with their respective aptitude levels. Consequently,
there may have been interaction between students across aptitude
groups in spite of the materials being self-instructional. In this way
an artificial, experimental environment was avoided, and the focus of
the study, the effects upon achievement of students grouped by
aptitude, could still be examined.

An eighth limitation was that the times-to-testing scores should
have been transformed to eliminate the possibility that the ratios of
the means and those of the variances were similar. The BMD 12V program
was unable to transform scores and consequently the mean cell scores
and variances for times-to-testing may have been similar. This imposed
a limitation for the data analysis.

Summary

This chapter presented a 3 x 10 x 2 (aptitude by
classes-nested-within-treatments by treatments) multivariate analysis
of variance (MANOVA) as the experimental design of the study. The main
purpose of the study was to compare self-instructional mastery and
non-mastery treatments to determine if there were differences in
achievement and time of high, middle, and low aptitude students.
Following the discussion of the experimental study, a description of
the pattern of logic used in the study was provided. Factors that could
not be controlled for statistically were discussed and described as
contextual variables.

Following the discussion of the contextual variables was a
description of the procedures used in the experimental study and the
limitations to the study. Data obtained in the experimental study
were used to test the statistical hypotheses. The results of the tests
within the limitations of the study are presented in the next chapter.
CHAPTER V

RESULTS AND DISCUSSION OF THE FINDINGS

The purpose of this chapter is to report, analyze, and discuss
the data collected in the present study. The chapter is divided into
two sections: 1) Presentation of the Findings; and 2) Discussion of
the Findings. Tables 5.1, 5.2, and 5.3 present the raw cell mean data
that were used in the multivariate analysis and subsequent data
analyses.

Presentation of the Findings

The findings for the study are reported separately for each tested
hypothesis.

Analysis of the Data by the Multivariate Analysis of Variance (MANOVA)

Analysis of the data by the BMD 12V program produced the
outcomes displayed in Tables 5.4, 5.5, 5.6, and 5.7.

Findings of Hypotheses for MANOVA
1. Interaction: Treatment by Aptitude

Table 5.4

Multivariate Analysis of Variance Test of Significance -
Learning, Retention, and Times-to-Testing

Source of Variance              Degrees of   Approximate F   p<
                                Freedom      Statistic
Treatment                       3, 34        14.8220         .001
Aptitude                        6, 68        11.9901         .001
Class (Treatment)               54, 102       3.0536
Treatment x Aptitude            6, 68         1.0163         NS*
Class x Aptitude (Treatment)

*NS = Not significant at p<.05

Table 5.5

Analysis of Variance for Treatment, Aptitude, and
Interaction - Learning

Source of Variance              Sum of      Degrees of   Mean        F
                                Squares     Freedom      Square
Treatment                        105.75      1            105.75      2.99
Aptitude                        3987.03      2           1993.51     56.39*
Class (Treatment)               2060.33     18            114.46      3.24
Treatment x Aptitude             203.20      2            101.60      2.87
Class x Aptitude (Treatment)    1272.64     36             35.35

*Indicates F ratios that are significant at the .05 level.
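
As a check on how the entries of Table 5.5 relate to one another, the
F ratios can be recomputed from the sums of squares and degrees of
freedom. The sketch below assumes that the Class x Aptitude (Treatment)
mean square served as the error term for every ratio; the source does
not state this in so many words, but the tabled values are consistent
with it. The variable names are illustrative only.

```python
# Sums of squares and degrees of freedom as reported in Table 5.5.
table_5_5 = {
    "Treatment": (105.75, 1),
    "Aptitude": (3987.03, 2),
    "Class (Treatment)": (2060.33, 18),
    "Treatment x Aptitude": (203.20, 2),
}
error_ss, error_df = 1272.64, 36  # Class x Aptitude (Treatment)

# Assumption: the last row of the table is the error term for each F ratio.
ms_error = error_ss / error_df

f_ratios = {
    source: round((ss / df) / ms_error, 2)
    for source, (ss, df) in table_5_5.items()
}
for source, f in f_ratios.items():
    print(f"{source:22s} F = {f:.2f}")
```

Each mean square is its sum of squares divided by its degrees of
freedom, and each F is that mean square divided by the error mean
square of about 35.35.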
Table 5.6

Analysis of Variance for Treatment, Aptitude, and
Interaction - Retention

Source of Variance              Sum of      Degrees of   Mean        F
                                Squares     Freedom      Square
Treatment                        553.65      1            553.65     16.28*
Aptitude                        4540.26      2           2270.13     66.74*
Class (Treatment)               1782.13     18             99.01      2.91
Treatment x Aptitude             140.61      2             70.31      2.07
Class x Aptitude (Treatment)    1224.57     36             34.02

*Indicates F ratios that are significant at the .05 level.

Table 5.7

Analysis of Variance for Treatment, Aptitude, and
Interaction - Times-to-Testing

Source of Variance              Sum of       Degrees of   Mean         F
                                Squares      Freedom      Square
Treatment                        85744.19     1           85744.19    26.60*
Aptitude                          1238.94     2             619.47     0.19
Class (Treatment)               455231.44    18           25290.63     7.85
Treatment x Aptitude              2162.13     2            1081.06     0.34
Class x Aptitude (Treatment)    116027.38    36            3222.98

*Indicates F ratios that are significant at the .05 level.
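\[
H_0:\;
\begin{bmatrix} \mu_{.111} \\ \mu_{.112} \\ \mu_{.113} \end{bmatrix}
-
\begin{bmatrix} \mu_{.121} \\ \mu_{.122} \\ \mu_{.123} \end{bmatrix}
=
\begin{bmatrix} \mu_{.211} \\ \mu_{.212} \\ \mu_{.213} \end{bmatrix}
-
\begin{bmatrix} \mu_{.221} \\ \mu_{.222} \\ \mu_{.223} \end{bmatrix}
=
\begin{bmatrix} \mu_{.311} \\ \mu_{.312} \\ \mu_{.313} \end{bmatrix}
-
\begin{bmatrix} \mu_{.321} \\ \mu_{.322} \\ \mu_{.323} \end{bmatrix}
\]

The three differences are, in order, the vectors of the high, middle,
and low aptitude groups by treatment.
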
This statistical hypothesis, that the average difference between
the two treatment vectors of average effects is the same at each
aptitude level, was tested against the alternative hypothesis that the
difference between the two treatment vectors of average effects is not
the same at each aptitude level.

The multivariate F statistic for interaction of treatment and
aptitude was not significant (see Table 5.4, p. 104). The null
hypothesis, therefore, was not rejected at the .05 level of
significance. As the multivariate interaction null hypothesis was not
rejected the univariate interaction hypotheses were not tested.
Therefore, separate reports for the univariate analyses for the
interaction hypotheses 4, 7, 10 are not presented.

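2. Main Effects: Treatments

\[
H_0:\;
\begin{bmatrix} \mu_{..11} \\ \mu_{..12} \\ \mu_{..13} \end{bmatrix}
=
\begin{bmatrix} \mu_{..21} \\ \mu_{..22} \\ \mu_{..23} \end{bmatrix}
\]
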
This statistical hypothesis, that with pupils pooled across the
three levels of aptitude there is no significant difference between
the mastery and non-mastery treatment vectors of average effects, was
tested against the alternative hypothesis that there is a difference
between the mastery and non-mastery treatment vectors of average
effects.

The multivariate F statistic for the treatment effect was
significant (see Table 5.4, p. 104). Therefore the null hypothesis was
rejected and the alternative hypothesis was accepted.

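3. Main Effects: Aptitude

\[
H_0:\;
\begin{bmatrix} \mu_{.1.1} \\ \mu_{.1.2} \\ \mu_{.1.3} \end{bmatrix}
=
\begin{bmatrix} \mu_{.2.1} \\ \mu_{.2.2} \\ \mu_{.2.3} \end{bmatrix}
=
\begin{bmatrix} \mu_{.3.1} \\ \mu_{.3.2} \\ \mu_{.3.3} \end{bmatrix}
\]
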

This statistical hypothesis, that across the two levels of
treatments there are no differences between the three
levels-of-aptitude vectors of average effects, was tested against the
alternative hypothesis that across the two levels of treatments there
are differences between the three levels-of-aptitude vectors of average
effects.

The multivariate F statistic for aptitude effects was
statistically significant (see Table 5.4, p. 104). Therefore the null
hypothesis was rejected and the alternative hypothesis was accepted.

Findings of Hypotheses for ANOVA

5. Main Effects: Treatments (Learning)

\[
H_0:\; \mu_{..11} = \mu_{..21}
\]

This statistical null hypothesis, that there is no statistically
significant difference between treatments on the mean posttest scores,
was tested against the alternative hypothesis that there are differences
between treatments on the mean posttest scores.

The univariate F statistic for treatment was not significant
(see Table 5.5, p. 104). The null hypothesis, therefore, was not
rejected at the .05 level of significance.

6. Main Effects: Aptitude (Learning)

\[
H_0:\; \mu_{.1.1} = \mu_{.2.1} = \mu_{.3.1}
\]

This statistical null hypothesis, that there are no statistically
significant differences among aptitude groups on the mean posttest
scores, was tested against the alternative hypothesis that there are
such differences.

The univariate F statistic for aptitude was significant (see
Table 5.5, p. 104).

To determine which pairs of aptitude means differed significantly,
the Duncan Multiple Range Test was applied to the univariate cell
matrix data to locate the source of the significant effect. Table 5.8
reports the results for aptitude effect on the posttest.

Table 5.8

Learning Mean Scores by Aptitude

Aptitude Group      N       Mean Score
High                20      36.34
Middle              20      27.05
Low                 20      16.39


The results of the test are reported in Table 5.9. All differences 
were statistically significant. 
Table 5.9

Learning: Summary of Results of the Duncan Multiple Range
Test at the .05 Level of Significance for Aptitude Effect

Pairwise Comparisons      Significance
1 vs 2                    .05
1 vs 3                    .05
2 vs 3                    .05


8. Main Effects: Treatment (Retention)

\[
H_0:\; \mu_{..12} = \mu_{..22}
\]

This null hypothesis, that there is no statistically significant
difference between treatments on the mean delayed posttest scores, was
tested against the alternative hypothesis that there is such a
difference.

The univariate F statistic for treatment was significant (see
Table 5.6, p. 105). The univariate cell matrix was examined to determine
the cell of significant treatment. The Treatment 1 (mastery) mean was
significantly larger than the Treatment 2 (non-mastery) mean.
Table 5.10 shows the difference between the two treatment means.

Table 5.10

Retention: Mean Scores for Treatments

             Mastery         Non-Mastery
             Treatment 1     Treatment 2
n            30              30
Means        29.05*          22.98

*The difference between the means is significant at the .05 level of
significance.

9. Main Effects: Aptitude (Retention)

\[
H_0:\; \mu_{.1.2} = \mu_{.2.2} = \mu_{.3.2}
\]

This null hypothesis, that there are no statistically significant
differences among aptitude groups on the mean delayed posttest scores,
was tested against the alternative hypothesis that there are such
differences.

The univariate F statistic for aptitudes was significant (see
Table 5.6, p. 105).

To determine which pair of aptitude means was significant the
Duncan Multiple Range Test analysis was applied to the univariate cell
matrix data to determine the source of the significant effect. Table
5.11 reports the results for aptitude effect on the delayed posttest.

Table 5.11

Retention: Mean Scores by Aptitude

Aptitude Groups     Cell Size     Mean Score
1                   20            37.03
2                   20            25.24
3                   20            15.77


The results of the test are reported in Table 5.12. All
differences were statistically significant.

Table 5.12

Retention: Summary of Results of the Duncan
Multiple Range Test at the .05 Level of Significance
for Aptitude Effect

Pairwise Comparisons      Significance
1 vs 2                    .05
1 vs 3                    .05
2 vs 3                    .05


11. Main Effects: Treatment (Times-to-Testing)

\[
H_0:\; \mu_{..13} = \mu_{..23}
\]

This null hypothesis, that there is no statistically significant
difference between treatments on mean times-to-testing, was tested
against the alternative hypothesis that there is such a difference.

The univariate F statistic for treatment was significant (see
Table 5.7, p. 105). The univariate cell matrix was examined to
determine the cell of significant treatment. Treatment 2 (non-mastery)
required significantly less time-to-testing than Treatment 1 (mastery).
Table 5.13 shows the difference between the two treatment means.

Table 5.13

Times-to-Testing in Minutes: Mean Number of Minutes for Treatments

             Mastery         Non-Mastery
             Treatment 1     Treatment 2
n            30              30
Means        551.67          475.06*

*The difference between these means is significant at the .05 level
of significance.

12. Main Effects: Aptitude (Times-to-Testing)

\[
H_0:\; \mu_{.1.3} = \mu_{.2.3} = \mu_{.3.3}
\]

This null hypothesis, that there are no statistically significant
differences among aptitude groups on the mean times-to-testing scores,
was tested against the alternative hypothesis that there are such
differences.

The univariate F statistic for aptitude was not statistically
significant (see Table 5.7, p. 105). The null hypothesis, therefore,
was not rejected at the .05 level of significance.

Findings of Hypotheses for Simple Effects

13, 14, 15. Simple Effects: Learning

\[
H_0:\;
\begin{aligned}
\mu_{.111} &= \mu_{.121} \\
\mu_{.211} &= \mu_{.221} \\
\mu_{.311} &= \mu_{.321}
\end{aligned}
\]


These null hypotheses state with respect to learning that at each
aptitude level there is no significant difference between the mastery
and non-mastery treatment means.

To test for significance between treatments across each level
of aptitude the appropriate post hoc technique was the Bonferroni t
test. Marascuilo and Levin (1970) suggest that manipulations of the
sources of variation and degrees of freedom in what is called a nested
or simple effects design and tested with the appropriate post hoc
technique will provide the necessary information. They claim that,

"From this, one may justly infer that sums of squares and
degrees of freedom, like matter, are neither created nor
destroyed, but are merely revealed in different forms."

To test the simple effects hypotheses a conservative alpha (α) of .10
was selected and partitioned into three equal sub-parts, one for each
of the hypotheses. Therefore .10/3 or .033 was the significance level
used.

As each set of hypotheses used the same components, they can be
stated here as:

EW = the alpha (α) level = .10
df error = 36
n = 10
Bonferroni t = 2.215

The formula for computing the standard error of the contrast is

\[
\hat{\sigma}_{\bar{X}_1 - \bar{X}_2}
  = \sqrt{\frac{2\,\text{M.S. error}}{N}}
\]

where M.S. error = mean square for error. The critical value for the
Bonferroni t statistic is

\[
\sqrt{\frac{2\,\text{M.S. error}}{N}} \cdot \text{Bonferroni } t .
\]


The Bonferroni t test tests for differences between the cell 
means. The cell means for treatments across each level of aptitude 
are presented in Table 5.14. 

Table 5.14

Learning: Cell Means and Differences for Treatments Across
Each Level of Aptitude

             Mastery         Non-Mastery     Differences
             Treatment 1     Treatment 2
High         38.28           34.40            3.87
Middle       30.26           23.83            6.43*
Low          15.22           17.56           -2.34

*Significant at the .05 level of significance.

Application of the Bonferroni t test yielded a critical value of
5.89. Therefore, a difference of 5.89 or larger was significant. The
results of testing the hypotheses follow.
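
The critical value can be reproduced from the components listed
earlier. The sketch below assumes that the N in the formula is the
cell size n = 10 and that M.S. error is the Class x Aptitude
(Treatment) mean square of 35.35 from Table 5.5; the function and
variable names are illustrative and not part of the original analysis.

```python
from math import sqrt


def critical_difference(ms_error, n, bonferroni_t):
    """Smallest difference between two cell means declared significant."""
    return sqrt(2 * ms_error / n) * bonferroni_t


# Values reported in the text: M.S. error from Table 5.5, n = 10, t = 2.215.
critical = critical_difference(ms_error=35.35, n=10, bonferroni_t=2.215)

# Mastery minus non-mastery differences as reported in Table 5.14.
differences = {"High": 3.87, "Middle": 6.43, "Low": -2.34}
decisions = {level: abs(d) >= critical for level, d in differences.items()}

print(f"critical difference = {critical:.2f}")
for level, significant in decisions.items():
    print(f"{level:6s} significant: {significant}")
```

Under these assumptions the critical difference rounds to 5.89, and
only the middle aptitude difference exceeds it.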

The null hypothesis that, with respect to learning, there is no
significant difference between the mastery and non-mastery treatment
means for high aptitude students was not rejected. High aptitude
mastery treatment students did not differ significantly from high
aptitude non-mastery treatment students on the posttest measure.

The null hypothesis that, with respect to learning, there is no
significant difference between the mastery and non-mastery treatment
means for middle aptitude students was rejected in favour of the
alternative hypothesis. The posttest treatment mean for middle aptitude
mastery students was significantly higher than the posttest treatment
mean for middle aptitude non-mastery students.

The null hypothesis that, with respect to learning, there is
no significant difference between the mastery and non-mastery
treatment means for low aptitude students was not rejected. Low aptitude
mastery students did not differ significantly from low aptitude
non-mastery treatment students on the posttest measure.

16, 17, 18. Simple Effects: Retention

\[
H_0:\;
\begin{aligned}
\mu_{.112} &= \mu_{.122} \\
\mu_{.212} &= \mu_{.222} \\
\mu_{.312} &= \mu_{.322}
\end{aligned}
\]

These null hypotheses state with respect to retention, that at
each aptitude level there is no significant difference between the
mastery and non-mastery treatment means.

The Bonferroni t test of significance was used to test for
significance between treatments across each level of aptitude. See the
description of the Bonferroni t test on p. 113. The cell means for
treatments across each level of aptitude are presented in Table 5.15.

Table 5.15

Retention: Cell Means and Differences for Treatments
Across Each Level of Aptitude

             Mastery         Non-Mastery     Differences
             Treatment 1     Treatment 2
High         40.82           33.25           7.57*
Middle       29.67           20.82           8.85*
Low          16.67           14.86           1.81

*Significant at the .05 level of significance.



Application of the Bonferroni t test yielded a critical value of
5.78. Therefore, a difference of 5.78 or larger was significant. The
results of testing the hypotheses follow.

The null hypothesis that, with respect to retention, there is no
significant difference between the mastery and non-mastery treatment
means for high aptitude students was rejected in favour of the
alternative hypothesis. The delayed posttest treatment mean for high
aptitude mastery students was significantly higher than the delayed
posttest treatment mean for high aptitude non-mastery students.

The null hypothesis that, with respect to retention, there is no
significant difference between the mastery and non-mastery treatment
means for middle aptitude students was rejected in favour of the
alternative hypothesis. The delayed posttest treatment mean for middle
aptitude mastery students was significantly higher than the delayed
posttest treatment mean for middle aptitude non-mastery students.

The null hypothesis that, with respect to retention, there is no
significant difference between the mastery and non-mastery treatment
means for low aptitude students was not rejected. Low aptitude mastery
treatment students did not differ significantly from low aptitude
non-mastery treatment students on the delayed posttest measure.

19, 20, 21. Simple Effects: Times-to-Testing

\[
H_0:\;
\begin{aligned}
\mu_{.113} &= \mu_{.123} \\
\mu_{.213} &= \mu_{.223} \\
\mu_{.313} &= \mu_{.323}
\end{aligned}
\]

These null hypotheses state, with respect to times-to-testing,
that at each aptitude level there is no significant difference between
the mastery and non-mastery treatment means.

The Bonferroni t test of significance was used to test for
significance between treatments across each level of aptitude. See the
description of the Bonferroni t test on p. 113. The cell means for
treatments across each level of aptitude are presented in Table 5.16.

Table 5.16

Times-to-Testing: Cell Means in Minutes and Differences
for Treatments Across Each Level of Aptitude

             Mastery         Non-Mastery     Differences
             Treatment 1     Treatment 2
High         537.20          477.67          59.52*
Middle       556.46          477.55          78.91*
Low          561.33          472.96          88.38*

*Significant at the .05 level of significance.

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Application of the Bonferroni t test yielded a critical value of 
17.80. Therefore, a difference as large as 17.80 was -significant. 
The results of testing the hypotheses follow. 
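
The comparison described above can be sketched in a few lines of Python. This is only an illustrative check, not the procedure the researcher ran: the cell means and the critical value of 17.80 are taken from the text, while the names `CELL_MEANS`, `CRITICAL_DIFFERENCE`, and `compare_treatments` are hypothetical. The sketch does not derive the Bonferroni critical value itself, and differences recomputed from the rounded cell means can differ from the tabled differences by 0.01.

```python
# Cell means in minutes from Table 5.16: (mastery, non-mastery).
CELL_MEANS = {
    "High": (537.20, 477.67),
    "Middle": (556.46, 477.55),
    "Low": (561.33, 472.96),
}

# Critical difference reported in the text for the Bonferroni t test.
CRITICAL_DIFFERENCE = 17.80


def compare_treatments(cell_means, critical):
    """Return {level: (difference, is_significant)} for each aptitude level.

    A difference counts as significant when its absolute value is at
    least as large as the critical difference.
    """
    results = {}
    for level, (mastery, non_mastery) in cell_means.items():
        difference = round(mastery - non_mastery, 2)
        results[level] = (difference, abs(difference) >= critical)
    return results


if __name__ == "__main__":
    for level, (diff, significant) in compare_treatments(
        CELL_MEANS, CRITICAL_DIFFERENCE
    ).items():
        label = "significant" if significant else "not significant"
        print(f"{level}: mastery minus non-mastery = {diff:.2f} min ({label})")
```

Every difference exceeds the critical value several times over, which is why all three null hypotheses for times-to-testing are rejected below.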

The null hypothesis that, with respect to times-to-testing, there
is no significant difference between the mastery and non-mastery
treatment means for high aptitude students was rejected in favour of
the alternative hypothesis. The times-to-testing mean score for high
aptitude mastery students was significantly greater than the
times-to-testing mean score for high aptitude non-mastery students.

The null hypothesis that, with respect to times-to-testing, there
is no significant difference between the mastery and non-mastery
treatment means for middle aptitude students was rejected in favour of
the alternative hypothesis. The times-to-testing mean score for middle
aptitude mastery students was significantly greater than the
times-to-testing mean score for middle aptitude non-mastery students.

The null hypothesis that, with respect to times-to-testing, there
is no significant difference between the mastery and non-mastery
treatment means for low aptitude students was rejected in favour of
the alternative hypothesis. The times-to-testing mean score for the low
aptitude mastery students was significantly greater than the
times-to-testing mean score for the low aptitude non-mastery students.

Discussion of the Findings

This study found that differences between aptitude levels were
increased rather than diminished when self-instructional materials were
used. High aptitude students learned and retained more of the geography
unit than middle or low aptitude students, while middle aptitude
students learned and retained more of the geography unit than low
aptitude students. These results suggest that achievement was a
function of the capacities and talents for learning that students of
varying aptitude brought to the instruction.

Table 5.17

Summary of Multivariate and Univariate Tests
of Significance: Interaction and Main Effects

Statistical Hypotheses (Null)                       F        Level of
                                                             Significance

There are no differences:

I.   Between vectors (MANOVA) of learning,
     retention, and times-to-testing:
      1. Interaction: treatment by aptitude        1.02      N.S.
      2. Main Effects: treatment                  14.82      .001
      3. Main Effects: aptitude                   14.99      .001

II.  Learning (ANOVA): mean differences
     for interaction and main effects:
      4. Interaction: treatment by aptitude        2.87      N.S.
      5. Main Effects: treatment                   2.99      N.S.
      6. Main Effects: aptitude                   56.39      N.S.

III. Retention (ANOVA): mean differences
     for interaction and main effects:
      7. Interaction: treatment by aptitude        2.07      N.S.
      8. Main Effects: treatment                  16.28      .05
      9. Main Effects: aptitude                   66.74      .05

IV.  Times-to-Testing (ANOVA): mean
     differences for interaction and main
     effects:
     10. Interaction: treatment by aptitude        0.34      N.S.
     11. Main Effects: treatment                  26.60      .05
     12. Main Effects: aptitude                    0.19      N.S.

Table 5.18

Summary of Tests of Significance for Simple Effects:
Comparisons of Aptitude Levels Across Treatments

Statistical (Null)        Mean Score    Mean Score    Mean          Level of
Hypotheses                Mastery       Non-Mastery   Difference    Significance
                          Treatment     Treatment

There are no differences:

I.   Learning: treatment means across aptitude levels
     (Simple Effects of II, Table 5.17)
     (13) High             38.28         34.40          3.87        N.S.
     (14) Middle           30.26         23.83          6.43        .05
     (15) Low              15.99         17.56         -2.34        N.S.

II.  Retention: treatment means across aptitude levels
     (Simple Effects of III, Table 5.17)
     (16) High             40.82         33.25          7.56        .05
     (17) Middle           29.67         20.82          8.85        .05
     (18) Low              16.67         14.86          1.81        .05

III. Times-to-Testing: treatment means across aptitude levels
     (Simple Effects of IV, Table 5.17)
     (19) High            537.20*       477.67*        59.53*       .05
     (20) Middle          556.46*       477.55*        78.91*       .05
     (21) Low             561.33*       472.96*        88.38*       .05

*Expressed in minutes

The analysis of simple effects of treatments across each level of
aptitude found that the mastery treatment facilitated greater retention
for the high and middle aptitude students, and greater learning for the
middle aptitude students. This gain is attributed to the
feedback-correction procedures required of the mastery students and the
additional time those procedures demanded for relearning. This result
is consistent with that of Fishburne (1971), who used a programmed and
a non-programmed text. He found that exposure to the programmed text
increased learning and retention but took more time across levels of
reading. He attributed increased student learning to the extra time
taken with the materials. Therefore, it would appear that
self-instructional materials at least facilitate retention for students
of high and middle aptitude. However, the mastery procedures did not
facilitate learning and retention for low aptitude students.

Low aptitude mastery students neither learned nor retained the
geography material more than low aptitude non-mastery students. The
low aptitude students used in this study obtained very low reading
scores as measured by the Iowa Tests of Basic Skills. When converted
to grade equivalent scores, the low aptitude mastery and non-mastery
students were reading at approximately fourth grade level. This is
almost four grade levels below actual classroom level (see Tables 4.3,
4.4, and 4.5 in Chapter Four) and at least two grades below the Grade
6 reading level of the materials Functions of Cities used in the study
(see p. 55 and Table 3.2 for discussion of readability). Therefore,
the lack of differences between the low aptitude mastery and non-mastery
students can be explained by the lack of verbal facility that low
aptitude students brought to instruction. This was particularly evident
in the scores obtained on the 40 item multiple choice and the 24 item
recall tests shown in the tables appearing in Chapter 3. The low
aptitude mastery and non-mastery students consistently scored lower
than the middle and high aptitude groups on the 40 item multiple choice
test and often did not start the 24 item recall test (see Tables 3.4
and 3.5). This strongly suggests that the strength of learning by low
aptitude students was indeed low. Another factor that reinforces this
position is that there was only a one chapter difference between high
and low aptitude students at the completion of instruction. This
suggests that low aptitude students did not spend the time in
relearning the material that was necessary to improve their learning.
The difficulty of the material, given their reading and vocabulary
deficiencies, probably caused frustration in learning and reduced their
task orientation. Therefore, the materials Functions of Cities were
probably too difficult for low aptitude students.

The review of the nine studies comparing mastery to non-mastery
strategies revealed that two were below the college level, three used
self-instructional materials, and none used social science materials.
Within this context, all studies reported that mastery facilitated
learning more than a non-mastery treatment. The emphasis of research
was at the university or college level, where the students used could
not be considered representative of normal classroom conditions.

The results of the present study indicate that when
self-instructional mastery procedures are used they do not facilitate
greater posttest average performance than non-mastery procedures. The
findings are contrary to Moore, Mahan and Ritts (1968), Green (1969),
and Gentile (1970). These researchers used self-paced procedures.
However, they used content that is sequential by nature (math and
science content) and each learning task was contiguous with the next.
This study used geography materials organized in a specific sequence
devised by the researcher. However, the materials were constructed and
organized around two major generalizations and this scheme was followed
through each of the chapters. The results of the present study apply
to the materials and students in this study, but it is reasonable to
suppose that similar results would be obtained if the same materials
were used with students who had similar contextual characteristics.

The literature concerning retention (Block, 1970; Kersh, 1970;
Romberg, Shepler, and King, 1970; and Wentling, 1970) found that
retention is facilitated when group-paced instruction is used with
correction and feedback. This study found that when self-instructional
geography materials were used, mastery procedures facilitated greater
retention than non-mastery procedures as measured by the delayed
posttest. Therefore, this would suggest that the correction-feedback
procedures, whether group-paced or self-instructional, facilitated
greater retention of original learning.

The literature review showed that only two studies reported the
time variable (Merrill, Barton, and Wood, 1970; Block, 1970). Both
studies indicated that learning became increasingly efficient over a
series of sequenced learning units in class-paced instruction. This
study did not support these findings. Mastery students used
considerably more time to learn the material than non-mastery students.
These time differentials also increased when comparisons were made
between aptitude levels. Therefore, the results of this study would
suggest that self-paced mastery instruction requires more time than
self-paced non-mastery instruction or class-paced instruction.

This chapter has presented the findings of the study for each of
the statistical hypotheses and has discussed some of the implications.
The next chapter provides a summary of the study, introduces some
educational implications, and recommends areas for further research.

CHAPTER VI

SUMMARY, DISCUSSIONS, AND RECOMMENDATIONS

Summary

This study was conducted under the sponsorship of the Geography
Curriculum Project of the University of Georgia. The purpose of the
study was to determine the effects of a self-instructional mastery
procedure upon the average achievement of students of varying aptitudes
using measures of learning, retention, and times-to-testing.
Research Hypotheses 

The major purpose of this study was to compare self-instructional
mastery and non-mastery treatments to determine if there were
differences in learning, retention, and time-to-testing of high,
middle, and low aptitude students.

The following research hypotheses were investigated. 

1. The mastery and non-mastery treatments will produce differences
in the average effects which are not the same (p<.05) at the high,
middle, and low aptitude levels measured by posttests of:

(a) learning

(b) retention
and a measure of:

(c) times-to-testing

2. With pupils pooled across the three levels of aptitude, the
difference between the mastery and the non-mastery treatments will
produce differences (p<.05) in the average achievement measured by
geography posttests of:

(a) learning

(b) retention
and a measure of:

(c) times-to-testing

3. With pupils pooled across the two treatments, there are
differences among the three levels-of-aptitude vectors of average
effects (p<.05) measured by geography posttests of:

(a) learning

(b) retention
and a measure of:

(c) times-to-testing

Procedures 

A geography unit titled Functions of Cities was developed by the
researcher. The self-instructional unit consisted of a student text
and two forms of the student workbook. Two treatments were devised.
The non-mastery treatment (T2) received the student text and a
workbook. The workbook contained prescribed activities and a single
review test for each chapter. Students worked through both. The mastery
treatment (T1) received the same student text but the workbook varied.
Each chapter of the workbook contained two review tests. If the
criterion level was not attained in the first review test, mastery
students were required to correct and relearn material and then take
a second review test.

Two basic concepts of urban geography used in relation to cities,
function and economic base, were identified as the major themes of
these project materials. The two major concepts, generalizations, and
facts were recorded in a table of specifications which was used in the
construction of the measuring instruments.

A 40 item multiple choice test and a 24 item recall test were
developed by the researcher to collect data to measure students'
performance for the experiment. Both tests were used to measure
learning and retention of the content materials. The retention measure
was administered 17 days after the conclusion of instruction.

Twenty grade seven classes from the Savannah-Chatham County
School District served as the experimental population. Treatments
were randomly assigned to classes in each school. All subjects were
administered the word meaning section of the Iowa Tests of Basic
Skills, Form 5 and 6 (Lindquist and Hieronymus, 1971). Students within
the 20 classes were then placed in three levels of aptitude. Classes
were then randomly assigned to two groups and treatment was randomly
assigned to groups.

Because individual classes were the smallest units of independence,
class should have been the smallest unit of analysis. However, because
this study focused upon aptitude groups within class, the aptitude
group mean was used as the analysis unit. The mean was obtained from
the unequal Ns for each of the sixty cells. A 3 x 10 x 2, aptitude by
classes-nested-within-treatments, by treatments, multivariate analysis
of variance was used to compare the differential effects of two
treatments across three levels of aptitude.
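
The choice of the aptitude group mean as the unit of analysis in the 3 x 10 x 2 design can be illustrated with a minimal sketch. It assumes each student record carries an aptitude level, a class identifier, a treatment label, and a score. The record layout, the sample scores, and the function names are hypothetical and are not taken from the study's data. The sketch only shows how unequal numbers of students collapse to one mean per cell and why the design yields sixty cells; it does not perform the multivariate analysis of variance.

```python
from collections import defaultdict
from statistics import mean


def cell_means(records):
    """Collapse student records to one mean per (aptitude, class, treatment) cell.

    records: iterable of (aptitude, class_id, treatment, score) tuples.
    Cells may hold unequal numbers of students; each contributes one mean.
    """
    cells = defaultdict(list)
    for aptitude, class_id, treatment, score in records:
        cells[(aptitude, class_id, treatment)].append(score)
    return {key: mean(scores) for key, scores in cells.items()}


def expected_cell_count(aptitude_levels=3, classes_per_treatment=10, treatments=2):
    """Number of analysis units when classes are nested within treatments."""
    return aptitude_levels * classes_per_treatment * treatments


if __name__ == "__main__":
    # Hypothetical scores for one mastery class and one non-mastery class.
    sample = [
        ("high", "class_01", "mastery", 36),
        ("high", "class_01", "mastery", 40),
        ("low", "class_01", "mastery", 15),
        ("high", "class_11", "non-mastery", 33),
    ]
    for cell, value in sorted(cell_means(sample).items()):
        print(cell, value)
    print("cells in full design:", expected_cell_count())
```

Using one mean per cell keeps the analysis close to the class as the unit of independence while still allowing aptitude groups within classes to be compared.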
Findings

The findings of the investigation were reported separately for
each of the statistical hypotheses used to test the research hypotheses.
The research hypotheses were intended to establish whether
self-instructional mastery procedures reduced differences in achievement
of high, middle, and low aptitude students, as measured by tests of
learning, retention, and times-to-testing.

This study found that differences in aptitude were not reduced
when self-instructional materials were used. The findings are
reported, more specifically, for the interaction of treatment and
aptitude, the main effects (treatment and aptitude), and the simple
effects of aptitude levels across treatments for learning, retention,
and times-to-testing.

Findings of the Treatment by Aptitude Interaction

No significant interactions between treatment and aptitude levels
were found on the learning, retention, and times-to-testing measures.
Treatment and aptitude were not acting together in this study.
Findings Between Treatment Groups 

Students of high aptitude scored significantly higher than middle 
and low aptitude students as did students of middle aptitude over 
students of low aptitude on learning and retention. However, there 
were no differences on the times-to-testing between any of the aptitude 
levels. 

Findings of the Aptitude Levels Across Treatment: Simple Effects

High and middle aptitude mastery treatment students retained
more than high and middle aptitude non-mastery treatment students, and
middle aptitude mastery students learned more than middle aptitude
non-mastery students. There was no difference in learning or
retention for the low aptitude students across treatments. However,
high, middle, and low aptitude non-mastery students used less time
than high, middle, and low aptitude mastery treatment students.
Discussion of Educational Implications 

The basic concerns of the researcher in this study were the
effects that a self-instructional mastery procedure had on students of
varying aptitude when social science materials were used. Since the
study found that the mastery procedure did not facilitate learning
and retention for low aptitude students, the following suggestions
would seem in order.

The disadvantaged learner brings to the classroom many learning
problems. It should be the teacher's and the school's responsibility
to assist these students. Mastery procedures would appear to offer
disadvantaged students some hope of overcoming some of their
environmental and hereditary learning deficiencies if a teacher is
prepared to work closely with the student and to carefully monitor the
mastery procedure at each level. The lack of teacher monitoring in
administering the review tests may have contributed to the poor
performance of the low aptitude students. The second review test for
the mastery students can be a strong relearning tool if used correctly.
The researcher did not request that the teachers monitor the retaking
of the review test. The researcher believes that this led to only
cursory examination of the learning material by all students and
particularly low aptitude students. This is a weakness in the
procedures used in this study and the researcher strongly recommends
that this be controlled for in subsequent studies of a similar design
and nature to this one. While the results of this study do not support
the use of self-instructional mastery materials with the low aptitude
student, class-paced mastery materials may operate more successfully
with the slow learner.

The lack of success by low aptitude students was also a function
of the degree to which low aptitude students were task oriented.
Typically, low aptitude students require close personal supervision by
the teacher, frequent feedback, and learning success. Stuempfig
and Maehr (1970) found, in a study concerning matching of materials
and student characteristics, that low performing students performed
better with personal rather than impersonal feedback. The low aptitude
students in this study used self-instructional materials where all
students responded independently to the learning exercises. As the
low aptitude students' performance, as measured by the geography
achievement test, did not differ from chance to any great degree,
this strongly suggests that self-instructional materials do not operate
as well with low aptitude students as they operated with middle and
high aptitude students.

The purpose of including the time measure in the study was to
determine whether the use of correction-feedback procedures which
required more time facilitated learning across levels of aptitude. As
the correction-feedback procedures required that more time be spent by
the mastery students, it was expected that mastery students should have
increased achievement. However, there were two disadvantages to this
practice. First, the mastery students did not complete as much of the
unit as non-mastery students. Therefore, the advantage of superior
achievement must be weighed against the disadvantage of less work
covered. The school must decide where its priority lies in this regard.
Second, the learning of social science materials and other disciplines
compete for a student's learning time each day of his school life. In a
society where success is most often measured by quantity rather than
quality, schools may not be able to afford the extra time that a
mastery procedure appears to require. The economics of achievement as
weighed against extra time to attain quality of learning may not be
compatible in today's schools.

Recommendations for Further Research

Based on the findings and conclusions of the present study, the
researcher submits the following specific recommendations for further
systematic research relating to the effects of mastery on students of
varying aptitude.

The first recommendation for further research can be found in the
threats to external validity which were inherent in the procedures and
design of this study. The reactive arrangement of treatment was a
possible limitation of the present study. Therefore, the following
recommendations are made for further research:

1. This study should be replicated in its present form using a
larger number of schools, grade levels, and school systems.

2. This study should be replicated in its present form without
the artificiality of an experimental setting and without the student's
knowledge that he is involved in an experiment.

The first recommendation made above emphasizes the fact that the
sample used in the experiment was drawn from a population of seventh
grade students of the Savannah-Chatham County School System. Thus, the
findings of this study can only be generalized to populations
that have similar characteristics. The second recommendation emphasizes
the fact that the sample used in the present study may have realized the
experimental nature of their situation. Future research should control
for this reactive arrangement.

The third recommendation concerns the selection of material and its
implementation in the classroom. The materials Functions of Cities
should be used in a subsequent study where the administration of the
materials is closely monitored by the researcher. This would overcome
problems that Gaines (1971), Kim (1969) and this study encountered in
making the mastery treatment more potent. Ideally, the researcher should
live on site for the period of the study. Therefore, the following
recommendations are made for further research:

3. The unit Functions of Cities should be administered with
greater researcher control and supervision to ensure that the difference
between treatments is enhanced.

4. A study should be conducted where the review tests and answer
sheets are distributed by the teacher when the student has demonstrated
that he is ready to perform these tasks.

This recommendation would assist the teacher and the researcher to
more closely monitor the treatment in the classroom. It was suspected
in this study that the availability of the review tests and the answer
sheets may have contributed to the formation of slovenly learning
habits.

The fifth recommendation suggests that the unit Functions of Cities
be used with various class levels and that a class-paced procedure be
devised to observe the effects of treatment across aptitude levels.
Therefore, the following recommendation is made for further research:

5. A class-paced procedure should be devised for various class
levels to observe the effects of treatment across varying aptitude
levels.

This recommendation was made because of the disadvantage that low
aptitude students confronted in this study. Closer personal
student-teacher contact may assist the low aptitude students to
overcome some of their personal weaknesses such as poor vocabulary,
poor understanding of the content, and frustration with the procedures.

Summary of Recommendations 

The need for further research comparing self-instructional
mastery procedures with self-instructional non-mastery procedures in
students' performance as measured by tests of learning, retention, and
times-to-testing has been demonstrated.

The findings of the present study are generalizable only to
similar populations using similar instructional materials and
measuring learning outcomes with instruments similar to those used in
the study.

The suggestions for further research recommended previously are
beyond the capabilities of any single researcher working alone to
accomplish. A systematic, comprehensive study of mastery is needed.
This would entail a large scale, well coordinated team effort where
individual investigators would each focus on a single task or variable
yet coordinate their research with that of their colleagues. A trend,
beginning with this study, has begun at the University of Georgia, where
a series of studies has been planned. It is strongly believed by this
researcher that such a group effort is needed, not only for research in
the broad spectrum of mastery, but in the many aspects of investigating
theories and practices in education.
APPENDIX A

Student Text: Functions of Cities
used by both Mastery and Non-Mastery Treatment Groups

A complete set of the unit Functions of Cities
may be ordered from the Geography Curriculum
Project, 107 Dudley Hall, University of Georgia,
Athens, Georgia 30602.

APPENDIX B

Student Workbook for the Non-Mastery Treatment Group

APPENDIX C

Student Workbook for the Mastery Treatment Group

APPENDIX D

List of Major Facts and Concepts to be Learned

I.  Facts: The facts are too numerous to mention here. See
    Appendix A (STUDENT TEXT), which incorporates the facts
    to be learned.

II. Concepts: The majority of the concepts to be learned were listed
    and briefly defined or described in the glossary. The
    glossary can be found at the end of the Student Text,
    pp. 10.1-10.6.

APPENDIX E

Table of Specifications for Achievement Tests

Table of Specifications: 40 Item Multiple Choice Achievement Test

(Cells give item numbers, where legible, and percentage of the test;
"?" marks an item number that is illegible in the source.)

Chapter Content Blocks        Knowledge               Application or    Total
                                                      Transfer

1. Economic base and          1, ?, ?                 3                 10%
   function                   7.5%                    2.5%

2. Durban: Port City          5%                      6, 8              10%
                                                      5%

3. Frankfurt: Commercial      10, 11, 12, 13          14                12.5%
   City                       10%                     2.5%

4. Pittsburgh: Industrial     ?, 15, 16, 17, 18                         12.5%
   City                       12.5%

5. Brasilia: Government       20, 22, 23              21, 24            12.5%
   City                       7.5%                    5%

6. Surfers Paradise:          27, 2?, 31              30, 33            15%
   Resort City                10%                     5%

7. Benares: Religious         32, 34, 35                                7.5%
   City                       7.5%

8. Mexico City: Dominant      37, 38                                    5%
   City                       5%

9. Tokyo: Super City          39, 40                                    5%
                              5%

General Terms                 9, 25, 36               19                10%
                              7.5%                    2.5%

Total                         77.5%                   22.5%             100%
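
The percentages in the table of specifications follow from the test length: with 40 items, each item carries 2.5% of the test. The sketch below checks the row and column totals under that assumption. The per-block item counts are inferred by dividing each tabled percentage by 2.5, not read from a list of items, and the names `ITEM_COUNTS`, `percent`, and `column_totals` are illustrative only.

```python
TOTAL_ITEMS = 40  # length of the multiple choice achievement test

# (knowledge items, application-or-transfer items) per content block.
# Counts are inferred from the tabled percentages (one item = 2.5%).
ITEM_COUNTS = {
    "1. Economic base and function": (3, 1),
    "2. Durban: Port City": (2, 2),
    "3. Frankfurt: Commercial City": (4, 1),
    "4. Pittsburgh: Industrial City": (5, 0),
    "5. Brasilia: Government City": (3, 2),
    "6. Surfers Paradise: Resort City": (4, 2),
    "7. Benares: Religious City": (3, 0),
    "8. Mexico City: Dominant City": (2, 0),
    "9. Tokyo: Super City": (2, 0),
    "General Terms": (3, 1),
}


def percent(item_count, total=TOTAL_ITEMS):
    """Share of the test, in percent, carried by item_count items."""
    return 100.0 * item_count / total


def column_totals(counts):
    """Return (knowledge %, application %, overall %) across all blocks."""
    knowledge = sum(k for k, _ in counts.values())
    application = sum(a for _, a in counts.values())
    return percent(knowledge), percent(application), percent(knowledge + application)


if __name__ == "__main__":
    for block, (k, a) in ITEM_COUNTS.items():
        print(f"{block}: {percent(k)}% + {percent(a)}% = {percent(k + a)}%")
    print("column totals:", column_totals(ITEM_COUNTS))
```

Under these inferred counts the knowledge and application columns sum to 31 and 9 items, which matches the 77.5% and 22.5% column totals in the table.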
The 24 item recall test items were all knowledge items based on the
content available in each chapter of the unit Functions of Cities.
See Appendix F for a copy of the recall test.

APPENDIX F

The 40 Item Multiple Choice and the 24 Item
Recall Geography Achievement Test

APPENDIX G

Report from Teachers' Weekly Report Form

APPENDIX H

The following forms were used:

1. Directions to Teachers: Non-Mastery and Mastery
Instructions.

2. Teacher Information Sheet.

3. School Characteristics Information Sheet.